This project is archived and is in readonly mode.

#2515 ✓resolved
Levin Alexander

db:schema:dump with PostgreSQLAdapter dumps multiple column indexes in undefined order

Reported by Levin Alexander | April 16th, 2009 @ 07:29 PM | in 2.x

When using postgres, db:schema:dump dumps indexes that span multiple columns in an undefined order.


postgres=# CREATE TABLE widgets ( name varchar, amount integer, size integer, weight integer);
CREATE TABLE
postgres=# CREATE INDEX widget_idx ON widgets(size, amount, weight, name);
CREATE INDEX
postgres=# SELECT * FROM pg_indexes WHERE tablename = 'widgets';
 schemaname | tablename | indexname  | tablespace |                                  indexdef
------------+-----------+------------+------------+-----------------------------------------------------------------------------
 public     | widgets   | widget_idx |            | CREATE INDEX widget_idx ON widgets USING btree (size, amount, weight, name)
(1 row)

$ rake db:schema:dump
$ cat db/schema.rb | grep widgets
  create_table "widgets", :id => false, :force => true do |t|
  add_index "widgets", ["amount", "name", "size", "weight"], :name => "widget_idx"

The last line is wrong, the order of the columns in the index is not preserved.

Comments and changes to this ticket

  • Max Lapshin

    Max Lapshin April 20th, 2009 @ 04:49 PM

    • Tag changed from active_record, postgres to 2.3.2, active_record, bug, index, patch, postgres, schema_dumper, test
    • Assigned user set to “Tarmo Tänav”

    I've added failing test, which really fails only for Postgresql and fixed Postgresql adapter. Now everything is OK.

  • Max Lapshin

    Max Lapshin April 20th, 2009 @ 04:55 PM

    Tarmo, this bug seems to be severe, because it can really break indexes on database.

  • Levin Alexander

    Levin Alexander April 20th, 2009 @ 06:00 PM

    The patch works for me. Thank You.

  • Anton Ageev

    Anton Ageev April 20th, 2009 @ 06:07 PM

    Confirming:

    
    $ ./script/dbconsole -p
    # \d auth_tokens
                                         Table "public.auth_tokens"
       Column   |            Type             |                        Modifiers                         
    ------------+-----------------------------+----------------------------------------------------------
     id         | integer                     | not null default nextval('auth_tokens_id_seq'::regclass)
     user_id    | integer                     | not null
     token      | character varying(255)      | not null
     created_at | timestamp without time zone | not null
    Indexes:
        "auth_tokens_pkey" PRIMARY KEY, btree (id)
        "index_auth_tokens_on_user_id_and_token" UNIQUE, btree (user_id, token)
    
    
    $ rake db:schema:dump
    (in /home/antage/workspace/avaxhome)
    $ cat db/schema.rb | grep auth_tokens
      create_table "auth_tokens", :force => true do |t|
      add_index "auth_tokens", ["token", "user_id"], :name => "index_auth_tokens_on_user_id_and_token", :unique => true
    

    Rails 2.3.2.1

    PostgreSQL 8.3.7

  • Repository

    Repository April 21st, 2009 @ 11:51 AM

    • State changed from “new” to “resolved”

    (from [f3ac4f387dd748db349ec217030d2b13260516f1]) Fixed dumping from postgresql columns in index in wrong order. [#2515 state:resolved]

    Signed-off-by: Tarmo Tänav tarmo@itech.ee http://github.com/rails/rails/co...

  • Repository

    Repository April 21st, 2009 @ 11:52 AM

    (from [5a4603fafbf4d959ebc8f9435457eaf6fa4004e9]) Fixed dumping from postgresql columns in index in wrong order. [#2515 state:resolved]

    Signed-off-by: Tarmo Tänav tarmo@itech.ee http://github.com/rails/rails/co...

  • Kevin Menard

    Kevin Menard April 24th, 2009 @ 10:40 PM

    It may be safer to collect up the attnum and sort by indkey. While IN appears to a left-to-right comparison, short-circuiting on true, I'm not sure that this isn't an implementation artifact. So, the fix as written may itself introduce a subtle ordering flaw. Having said that, I don't think I've ever seen an RDBMS implement it any other way even if it technically could.

  • Michael Schuerig

    Michael Schuerig April 25th, 2009 @ 09:51 AM

    Please consider using the patch I attached to #2400. I does sort by attribute order.

  • Tarmo Tänav

    Tarmo Tänav April 25th, 2009 @ 03:15 PM

    Could you explain the problem in more detail? The way I see it, it looks like indkey defines the column order, and column_names is sorted exactly the same way as indkey as it's just a .map of indkey, so what does the IN in the SQL have to do with any ordering?

  • Michael Schuerig

    Michael Schuerig April 25th, 2009 @ 06:33 PM

    Tarmo, your version gets the order right, but at the price of two queries and parsing in ruby. You can have it all in one query:

    
    SELECT i.relname, d.indisunique, a.attname
    FROM pg_class t, pg_class i, pg_index d, pg_attribute a,
    generate_series(0, 9) AS c(i)
    WHERE i.relkind = 'i'
    AND d.indexrelid = i.oid
    AND d.indisprimary = 'f'
    AND t.oid = d.indrelid
    AND t.relname = '{table_name}'
    AND i.relnamespace IN (SELECT oid FROM pg_namespace WHERE nspname IN (#{schemas}) )
    AND a.attrelid = t.oid
    AND d.indkey[c.i]=a.attnum
    ORDER BY i.relname, c.i
    
  • Tarmo Tänav

    Tarmo Tänav April 25th, 2009 @ 06:48 PM

    Though I do not have postgresql 7.4 to test and don't really know how well rails currently works with it, I do believe that generate_series is not by default present in 7.4. The overhead of the two queries is unlikely to be significant (especially in production mode where it matters since they will be executed only once).

  • Kevin Menard

    Kevin Menard April 26th, 2009 @ 08:51 PM

    Tarmo,

    You are correct. I missed the mapping step afterwards and thought that the patch was attempting to fix this through SQL only. It would be hard to catch because I believe if you took the mapping portion out the test would still pass as an implementation detail. Anyway, that's neither here nor there. Thanks for setting the record straight.

  • Kevin Menard

    Kevin Menard April 26th, 2009 @ 08:54 PM

    Michael,

    Another issue with using generate_series is that it reintroduces a problem with the original code that the applied patch fixes. Namely, you're imposing a hard 10 column limit on the size of the index. Going back at least as far as PostgreSQL 7.4, the default DB limit is 32 and could be compiled higher. The applied patch removes this artificial limitation.

  • Michael Schuerig

    Michael Schuerig April 27th, 2009 @ 12:57 PM

    Tarmo, Kevin: I agree, it's better not to hard code the number of index columns.

    Have a look at the attached patch that gets rid of the second explicit SQL statement by reusing a slightly extended #column_definitions (it interprets an integer table name as an oid).

    I'm thinking of extending the PostgreSQL adapter for expression indexes and partial indexes. To make that easier, I'd like the code to be as free of redundancies as possible.

  • Michael Schuerig

    Michael Schuerig April 27th, 2009 @ 01:29 PM

    That's what I get for not running the tests.

  • Michael Schuerig

    Michael Schuerig April 27th, 2009 @ 01:32 PM

    And finally without debug statements. Sorry for the mess.

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Referenced by

Pages