This project is archived and is in readonly mode.

#3460 ✓stale
Maxim Kulkin

find_each performance improvement

Reported by Maxim Kulkin | November 5th, 2009 @ 09:41 AM

Right now find_each splits the result set by ordering by primary key, using limit and condition on primary key. So, while iterating over the records in result set, there will be (RESULT_SET_SIZE + BATCH_SIZE - 1) / BATCH_SIZE requests to database. This can be improved to only one request.

The database adapters support #select(sql, name) method that returns an array of all results in a hash format (each item in array is a Hash of attributes from corresponding row of result set). Most often that method gets a result object from underlying database driver and iterates over result rows. This iteration can be exposed to upper levels for access optimization.

Proposed solution:

Implement a #select_each(sql, name) method for database adapters that will read rows from the database, convert them to Hashes (if they not) and just yield those Hashes. Rewrite #select method to make use of #select_each.

Implement ActiveRecord's #find_each to check if underlying database adapter supports #select_each (to be sure not to break compatibility with third-party database adapters) and if so call #select_each, get the yielded record attributes, instantiate an object and yield:

def find_each(...)
  # construct the sql

  connection.select_each(sql) do |attributes|
    yield instantiate(attributes)
  end
end

Benefits:

  1. Only one request to database
  2. No limitations on the selected values (current implementation requires primary key to be included in result)
  3. No limitations on ordering records (current implementation forces order on primary key)
  4. Memory efficient

Comments and changes to this ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Pages