This project is archived and is in readonly mode.

#1738 ✓resolved
Luke Ludwig

connection.columns doesn't get cached

Reported by Luke Ludwig | January 12th, 2009 @ 03:50 PM | in 2.x

This patch caches columns for has_and_belongs_to_many associations. The cache is stored in a hash (with table name as the key) within one of the owners of the habtm association. The connection.columns method ("SHOW FIELDS" query) is called when a find or an insert is done through the habtm association and can be expensive depending on the database configuration.

After applying this patch on Team Sport Tech's app we saw a noticeable performance improvement. See details at http://lukeludwig.com/blog/2009/...

If someone could take a look at this patch and let me know what they think that would be great,

Luke Ludwig

Comments and changes to this ticket

  • Matt Jones

    Matt Jones January 14th, 2009 @ 04:53 PM

    • Title changed from “Caching columns for has_and_belongs_to_many” to “connection.columns doesn't get cached”

    There's something more complicated going on here; the return value for columns is supposed to be cached. (in query_cache.rb)

    It looks like this is a clash between alias_method_chain in query_cache.rb and the subclassing in each individual adapter; the cache method never gets called... It just happens that AR caches the value for regular objects itself (line 1218, base.rb).

    This problem needs a patch that addresses the general problem, not just for habtm.

  • Michael Koziarski

    Michael Koziarski January 15th, 2009 @ 08:54 PM

    • Assigned user set to “Michael Koziarski”

    This is primarily an issue with HABTM though right? As the other associations don't look at their columns every time you try to use them?

    Luke, perhaps the best bet here is to simply cache the number of columns in the join table when the association is first used? So have this live in the association instance not the model class?

  • Luke Ludwig

    Luke Ludwig January 16th, 2009 @ 03:34 AM

    Matt - Doesn't the query_cache only persist for the life of the request? We want the habtm columns to be cached for the life of the process, (or until reset_column_information is called), similar to how ActiveRecord::Base caches columns. It seems that the alias method chaining of columns in query_cache should just be removed... even if it was working it is redundant unless someone calls connection.columns directly from their code.

    Michael - By caching the columns in the "association instance" I assume you mean within the reflection object. This seems to work as intended (caches columns for life of process) and is cleaner then my original solution of putting the cache in the owner class.

    module ActiveRecord module Reflection

    class AssociationReflection < MacroReflection
    
      def columns(join_table_name)
        unless defined?(@columns) && @columns
          @columns = klass.connection.columns(join_table_name, "#{join_table_name} Join Table Columns")
        end
        @columns
      end
    
    

    I'll get a new patch together using this approach.

    Luke

  • Michael Koziarski

    Michael Koziarski January 16th, 2009 @ 03:56 AM

    That's what I had in mind luke, just make sure it forgets the info when you call reset_column_information, and I'll be happy to apply the patch.

  • Luke Ludwig

    Luke Ludwig January 16th, 2009 @ 07:15 PM

    Here is the patch file. The columns are now cached within the reflection instance. You can now call columns and reset_column_information on the the habtm association directly, like this:

    developer = Developer.find(1)
    devloper.projects.columns
    developer.projects.reset_column_information
    
    
  • Michael Koziarski

    Michael Koziarski January 16th, 2009 @ 10:09 PM

    
    +        unless defined?(@columns) && @columns
    +          @columns = klass.connection.columns(tbl_name, log_msg)
    +        end
    +        @columns
    

    This is synonymous with @columns ||= klass...

    Is there any reason you have the specific defined? test in there?

  • Luke Ludwig
  • Repository

    Repository January 17th, 2009 @ 05:09 AM

    • State changed from “new” to “committed”

    (from [3ee4e009185173aab78f6503ee45e3ef4482874e]) Cache columns for has_and_belongs_to_many associations

    This avoids repeatedly calling SHOW COLUMNS when the association is queried [#1738 state:committed] http://github.com/rails/rails/co...

  • Repository

    Repository January 17th, 2009 @ 05:11 AM

    (from [7c147e94e66530a8314bc5d836df412fd749d55b]) Cache columns for has_and_belongs_to_many associations

    This avoids repeatedly calling SHOW COLUMNS when the association is queried [#1738 state:committed] http://github.com/rails/rails/co...

  • CancelProfileIsBroken
  • Luke Ludwig

    Luke Ludwig January 17th, 2009 @ 04:06 PM

    Looks like the sqlite adapter doesn't make any queries when calling connection.columns, at least from the perspective of assert_queries, which causes the test_caching_of_columns test to fail. I added an 'if current_adapter?(:MysqlAdapter)' around the test, since the only adapter that I know this test works with for sure is the MysqlAdapter.

  • CancelProfileIsBroken

    CancelProfileIsBroken January 17th, 2009 @ 04:39 PM

    Checking the build logs, we're seeing pass on MySQL, fail on PostgreSQL, SQLite2, SQLite3, before this patch.

    All tests pass after applying this patch, but I'm -1 on it anyhow. If the problem is that the tests aren't testing the right thing, we should fix the tests. If the problem is that the caching works differently depending on adapter, we should have the check in the caching code, not in the test code. This approach is just hiding the issue.

  • Luke Ludwig

    Luke Ludwig January 17th, 2009 @ 06:11 PM

    The caching works the same for all adapters, so this is just a test issue. The caching of the habtm columns is very similar in regards to the caching of columns in ActiveRecord::Base. As far as I can tell there aren't any tests to verify that the caching works in ActiveRecord::Base.columns, so this test that I've added is better than nothing! I can look at getting this test to work for the other adapters, and at adding similar tests for ActiveRecord::Base.columns.

  • Matt Jones

    Matt Jones January 19th, 2009 @ 05:27 AM

    @Luke - indeed, there isn't a test. The current code in query_cache.rb doesn't get called at all.

    I've got an idea how to rewrite it transparently (using some of the .extend stuff wycats wrote about yesterday) - will post a patch in the next few days.

  • Luke Ludwig

    Luke Ludwig January 20th, 2009 @ 04:25 AM

    Attached is a patch that fixes test_caching_of_columns to work for all database adapters. I did an alias_method_chain on connection.columns to count the number of times it is called, and then defined an assert_columns method which I use instead of assert_queries. I also added a test_caching_of_columns to base_test.rb.

    Note that these tests do not test the query cache in any way.

  • Michael Koziarski

    Michael Koziarski January 20th, 2009 @ 10:32 AM

    • State changed from “committed” to “open”
    • Assigned user changed from “Michael Koziarski” to “Pratik”
  • Pratik

    Pratik March 14th, 2009 @ 04:17 PM

    • State changed from “open” to “resolved”

    Is this still a problem ? If so, could you please open a new ticket and assign to me.

    Thanks

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Referenced by

Pages