Agile database schema migration tool for Java

Update: Grails Migration Plugin Forum

About a month into building Gauntlet we found ourselves in a situation where it was impossible for us to keep our development databases up to date with the latest changes to the schema. We were sending around emails telling each other what changes need to be made alongside our check-ins. In order to get around this problem I spent a few hours building a rudimentary database schema migration tool. When you made a change to the database schema you would have to build a DDL file that would make the change and then update the version numbers for the Gauntlet software and within the database. A few short months later and I discovered that we had built something very much like ‘rake migrate’ from Ruby on Rails — in fact it was almost exactly the same except that ours worked at runtime rather than only from the command-line. Fast forward a couple years later and I no longer own the Gauntlet source code, so yesterday I set out to rebuild the schema migration tool from scratch

The basic concept is very simple. The first time your program connects to its database it calls the schema migration code to make sure that the code and the database are at the same version so that you are always using data objects and queries that match the schema that is present in the database. When you make the call you need to pass the tool all the details about the database that you are connecting to, the place to find and classes or scripts to do the migration, and the current version of the client. Then, if the database version is less than the client version, the migration tool systematically searches for first database type specific DDL classes/scripts then generic versions, executing them in turn to migrate the schema forward until the database version and the client version match. If it encounters a case where the client version is less than the database version it has no recourse but to fail. As a bonus, it also offers a version 0 transition where it will bootstrap you from a completely empty database to your initial schema.

For instance, lets say in version 1 of your database you have the following table:

mysql> describe event;
+---------------+---------------+------+-----+-------------------+----------------+
| Field         | Type          | Null | Key | Default           | Extra          |
+---------------+---------------+------+-----+-------------------+----------------+
| id            | bigint(20)    | NO   | PRI | NULL              | auto_increment |
| time          | timestamp     | NO   |     | CURRENT_TIMESTAMP |                |
| z             | bigint(20)    | NO   | MUL |                   |                |
| ip_address    | int(11)       | YES  |     | NULL              |                |
| user_agent_id | int(11)       | YES  |     | NULL              |                |
| referrer      | varchar(256)  | YES  |     | NULL              |                |
+---------------+---------------+------+-----+-------------------+----------------+

Then you decide that you want to change the name of the referrer field to url. In order to do that you would create a new migration script that updates the field name and the database version:

ALTER TABLE event CHANGE referrer (url varchar(256));
UPDATE db_version SET version = 2;

You would name that script migrate1.sql and put it in the mysql specific database migration scripts. You would also then update the client version to 2 as well. Once that was done anyone who uses the new client code against their database will automatically get the schema changes required for the client to work with the database. This drastically cuts down on the amount of communication that needs to occur in typical database development situations. You can find the project that implements this db schema migration here. It has one dependency that is included with the project, my cli-parser