This project is archived and is in readonly mode.

#5883 ✓stale
Daniel Beardsley

Design of Schema.define() can cause missed migrations

Reported by Daniel Beardsley | October 29th, 2010 @ 07:43 AM

This is an architectural issue, but the way the schema.rb file only lists one version number makes it possible to end up in a state where the entries in your schema_migrations table don't match up with the migrations that have actually been applied to your database.

There are lots of ways this could happen, but here is one scenario:

  1. Two developers are working on separate branches and each creates a migration.
  2. Both merge their work back into master.
  3. One of them inevitably has a conflict with the Schema.define(:version => line within db/schema.rb
  4. Developer picks the wrong Schema.define line when merging
  5. Someone later uses rake db:schema:load and now has incorrect entries in their schema_migrations table because Schema.define assumes all prior migrations have already happened.

Depending on the scenario, this can happen both ways (schema.rb assumes incorrectly that migrations have been applied OR assumes incorrectly that migrations haven't been applied).

This Stack Overflow answer does a much better job of explaining the problem. Also see this Lighthouse ticket comment by Ryan Bates

Solution

Really, schema.rb should list all migration numbers that have already been applied so that we can guarantee rake db:migrate == rake db:schema:load

I propose changing the Schema.define method to accept :verions => [...] and properly sort and list all the entries (one per line). I'll gladly create a patch and some test cases if this is a good idea.

Comments and changes to this ticket

  • Andrew White

    Andrew White October 29th, 2010 @ 09:37 AM

    • Importance changed from “” to “Low”

    I don't think listing all the applied migrations in schema.rb is a good idea - you'll still get a merge conflict on the Schema.define line. The task db:migrate:status will show you any migrations that haven't been applied, so I'd be more in favour of removing the version line to eliminate the merge conflict.

    What I'd like to see is being able to edit the schema.rb directly and then running a rake task sync the database to the schema with an option to save the migration first for tweaking if necessary.

  • Daniel Beardsley

    Daniel Beardsley October 29th, 2010 @ 10:00 AM

    Why would there be a merge conflict? Each migration timestamp would be on its own line (and sorted):

      Schema.define(:versions => %w{
          201008110387143
          201008140387943
          201008210562792
          ...
          201008110387943
        )) do
    

    Also, merging would be easy, just include both lines.

    This Stack Overflow answer does a much better job of explaining the problem.

    rake db:migrate:status only shows migrations that aren't listed in the schema_migrations table but do exist in your db/migrate directory; If the above situation happens, there will be entries in schema_migrations for migrations that truly haven't been applied to the database.

    The :version parameter should NOT be removed; if it was, a db:schema:loaded database would have NO way of knowing which migrations have been applied (unless you just assume ALL of them).

    One of the main points of schema migrations is to create an easy way to undo changes to the database, editing the schema.rb directly and doing a text-based diff to create a migration (maybe I misunderstood your suggestion) would only allow forward changes and couldn't be used to automatically create a down step. Besides, schema.rb creates a database from scratch, it has no mechanism to "sync" an existing database.

  • Andrew White

    Andrew White October 29th, 2010 @ 11:19 AM

    Putting each one on a line will eliminate the merge conflict for the simple case but there's still a possibility of conflict when you have multiple migrations and long running feature branches. Also there is the issue of what order the migrations should be applied - date order or the schema.rb specified order. Then you've got the issue of apps with lots of migrations - I've one that started out using Rails 1.0 that now has a couple of hundred migrations. I don't want those 200 lines in my schema.rb. We should rethink the whole schema/migrations issue rather than trying to patch something that's fundamentally flawed.

    I'm not sure what you mean with the db:schema:loaded database - the db:schema:load task recreates all of the tables and doesn't check what migrations have been applied. If you mean the current database then the schema_migrations table has the list of applied migrations, but you know that.

    As for having no down step, DHH's keynote from RailsConf 2010 suggested this was the way 3.1 was heading:

    http://www.youtube.com/watch?v=b0iKYRKtAsA
    Timecode 37:34

    class CreateTolkTables < ActiveRecord::Migration
      def self.change
        table :tolk_locales do |t|
          t.string   :name
          t.datetime :created_at
          t.datetime :updated_at
        end
      end
    end
    

    I know there's no mechanism for syncing changes in ActiveRecord::Schema - I was thinking of adding some. Often the down step is irreversible and also often doesn't work because the up step didn't complete successfully. To revert to a previous version of the database you'd just revert your schema.rb and sync. You'd have options to skip destructive actions like remove_column, remove_index and drop_table and you'd also be able to sync either way.

  • Ryan Bigg

    Ryan Bigg November 8th, 2010 @ 01:54 AM

    • Tag cleared.

    Automatic cleanup of spam.

  • Adrian Irving-Beer

    Adrian Irving-Beer November 11th, 2010 @ 12:30 AM

    I stumbled across this issue by accident, but it's something I was already considering implementing, so I figured I'd share my two cents.

    In one of my projects, we use a workflow that avoids the schema.rb conflicts altogether:

    • Rename schema.rb to schema.versioned.rb in version control.
    • Do migrations as necessary. Users can bootstrap by copying schema.versioned.rb to schema.rb and applying all remaining migrations.
    • Our CI server always bootstraps from schema.versioned.rb so we always know about conflicts.
    • When the number of migrations gets high, copy your schema.rb over the schema.versioned.rb and commit.

    This introduces one problem, however: If user A creates a migration on a topic branch, then user B does the schema.versioned.rb update, and then user A merges their branch, their migration ends up getting skipped due to the higher :version (with no conflict).

    Our solution for now is that, whenever we update the schema.versioned.rb, we delete all included migrations and set the :version to zero. This prevents old migrations from being re-run, and new migrations from being skipped.

    Listing migration versions in the schema.rb (as per Daniel's solution) would be ideal here.

    Obviously we have an unsupported non-standard workflow, and I'm not suggesting that Rails cater specifically to that — but my point is that listing versions is a more complete, more flexible solution that supports various possible workflows much better than just declaring a version cutoff.

    Users always have the option to prune the version list when they prune the migrations directory. I'm not really sure how having a couple hundred lines in a (presumably already rather large) schema file is worse than having the same number of files in a directory. Especially if they were moved to the bottom instead, say.

    As for "Also there is the issue of what order the migrations should be applied - date order or the schema.rb specified order" — this is a bit of a red herring, since schema.rb doesn't apply migrations, it only declares which migrations have been applied.

  • Santiago Pastorino

    Santiago Pastorino February 11th, 2011 @ 07:18 PM

    • State changed from “new” to “open”

    This issue has been automatically marked as stale because it has not been commented on for at least three months.

    The resources of the Rails core team are limited, and so we are asking for your help. If you can still reproduce this error on the 3-0-stable branch or on master, please reply with all of the information you have about it and add "[state:open]" to your comment. This will reopen the ticket for review. Likewise, if you feel that this is a very important feature for Rails to include, please reply with your explanation so we can consider it.

    Thank you for all your contributions, and we hope you will understand this step to focus our efforts where they are most helpful.

  • Santiago Pastorino

    Santiago Pastorino February 11th, 2011 @ 07:18 PM

    • State changed from “open” to “stale”
  • bingbing

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Pages