Database structure changes can be very stressful especially in production. It is a pain to migrate databases without having a formalized solution. We all are aware that keeping track of the schema changes is important and always very beneficial to reduce the overall cost of the change.
Database migration primarily involves changing the database structure at a point in time to another, resulting in a new version. Prevention of possible errors in the process, caused by any human factor, can be critical in the maintenance of database servers running different schema versions. Importantly, we should apply one schema change only once and in cases when running two or more operations at same time, we need to make sure the changes are free of race conditions.
While there are other database migrations tools available, we preferred alembic as an automated migration tool for a client application with multiple databases. We use SQLAlchemy as an Object Relation Mapper (ORM) in the application and alembic is a lightweight database migration tool for usage with SQLAlchemy. Also, alembic is developed and maintained by Mike Bayer, the author of SQLAlchemy.
Alembic can be useful for:
Multiple database support
Automatic scripts generation
Avoidance of race conditions
Scripts encapsulation in a single file
Backward or downgrade compatibility
Offline mode support
Typical alembic environment structure looks like:
project/ alembic.ini alembic/ env.py script.py.mako versions/ generated_migration_scripts.py
Alembic can be configured by providing the database driver information in the alembic.ini file or by customizing the database config in env.py so that the configurations are in sync between application and migration scripts. Alembic supports the multidb configuration to have the versions for multiple databases defined in the same environment. In our application for databases with separate schema definitions we created two different alembic environment directories, each holding their own configuration and versioned migration scripts. This way it helped us to maintain the versions separately and avoid probable schema conflicts while running them. One way or the other, it seems to be well customizable and easy to adapt as needed.
Our client application is hosted in Amazon’s Elastic Compute Cloud (EC2) with the database in Amazon’s Relational Database Server (RDS). We configured our Amazon CloudFormation script adding a new task definition to run the database migration script as a standalone task. We included alembic upgrade commands in a bash script, which are triggered by the defined AWS task. The task can be run using ‘Run Task’ operation in Amazon ECS cluster console or using AWS Command Line Interface as,
aws ecs run-task --cluster <ECSCluster_identifier>\ --task-definition <alembic taskdefinition identifier>
The setup process seems to be straightforward and well documented. Using SQLAlchemy and alembic together, we can automatically generate and customize the migration scripts and we can identify the differences between existing database tables and the defined model in code. There are some limitations of the autogenerate option that are mentioned in the documentation. Each migration is called a revision and knows what order to be run in because each revision is given a down_revision to identify its parent. The revisions range from base to head, base being the initial or stamped revision and head as the latest revision.
Adopting the simpler workflows of alembic has helped us reduce complexity and risks, save implementation time, and increase development throughput. It has provided an automated database refactoring technique which certainly gives us more control over the release process of new changes and enables continuous delivery of our product.