#1279 ✓invalid
2.1.0 associations cause memory leak?

Reported by onur güngör | October 27th, 2008 @ 01:31 PM | in 2.x

Hi,

This is my first bug report, so please forgive me if I am missing something.

I am seeing a memory leak when I run the following Rake task. The task runs for a long time, consumes memory very quickly, and slows down as a result.

I suspect that the sentences association (has_and_belongs_to_many) in word.rb is responsible, because when I replace sentences_count = word.sentences.size with sentences_count = 0 the memory leak is gone.

I found _why's advice and think this may be related to it: http://whytheluckystiff.net/arti...

Additionally, I used the following simple memory profiler: http://scottstuff.net/blog/artic...

I will attach the output of this profiler for the two configurations, one with the association and one without.


desc 'how many times a token set is used.'
task :count_each_word_occurence_roughly => :environment do

  word_count = 0

  words = Word.find(:all, :order => "id")
  words.each do |word|
    parse_tree = word.parse_tree

    parse_tree.root_node.each do |node|
      if node.children.length > 1
        tokens = []
        node.children.each do |child|
          token = Token.find_or_create_by_text(child.name)
          tokens << token.id
        end
        tokens.sort!
        token_ids = tokens.join("-")
        token_set = TokenSet.find_or_initialize_by_token_ids(token_ids)

        sentences_count = word.sentences.size
        if token_set.roughtotal
          token_set.roughtotal += sentences_count
        else
          token_set.roughtotal = sentences_count
        end

        token_set.save
      end
    end

    word_count += 1
    if word_count % 10000 == 0
      print "processed " + word_count.to_s + " words\n"
      GC.start
    end
    # break unless word_count < 2000
  end
end

class Word < ActiveRecord::Base

  #  has_many :word_parses
  #  has_many :parses, :through => :word_parses
  has_and_belongs_to_many :parses, :join_table => "word_parses"

  has_and_belongs_to_many :features, :join_table => "words_features", :uniq => true

  has_and_belongs_to_many :sentences, :join_table => "words_sentences", :uniq => true

  has_many :spellings

  ...


class Sentence < ActiveRecord::Base

  has_and_belongs_to_many :words, :join_table => "words_sentences", :uniq => true
end

schema.rb


ActiveRecord::Schema.define(:version => 5) do

  create_table "features", :force => true do |t|
    t.text "description"
  end

  create_table "parses", :force => true do |t|
    t.text "parse_text", :null => false
  end

  add_index "parses", ["parse_text"], :name => "parse_text"

  create_table "sentences", :force => true do |t|
    t.text "spelling_sequence",      :null => false
    t.text "correct_parse_sequence", :null => false
  end

  create_table "spellings", :force => true do |t|
    t.integer "word_id", :null => false
    t.text    "text",    :null => false
  end

  add_index "spellings", ["word_id"], :name => "word_id"
  add_index "spellings", ["text"], :name => "text"

  create_table "token_sets", :force => true do |t|
    t.integer  "count"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.text     "token_ids"
  end

  add_index "token_sets", ["token_ids"], :name => "token_ids"

  create_table "tokens", :force => true do |t|
    t.text     "text"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  add_index "tokens", ["text"], :name => "text"

  create_table "word_parses", :force => true do |t|
    t.integer "word_id",  :null => false
    t.integer "parse_id", :null => false
  end

  add_index "word_parses", ["word_id"], :name => "word_id"
  add_index "word_parses", ["parse_id"], :name => "parse_id"

  create_table "words", :force => true do |t|
    t.string "name", :limit => 100, :null => false
  end

  add_index "words", ["name"], :name => "name"

  create_table "words_features", :id => false, :force => true do |t|
    t.integer "word_id",    :null => false
    t.integer "feature_id", :null => false
  end

  add_index "words_features", ["word_id"], :name => "word_id"
  add_index "words_features", ["feature_id"], :name => "feature_id"

  create_table "words_sentences", :id => false, :force => true do |t|
    t.integer "word_id",     :null => false
    t.integer "sentence_id", :null => false
  end

  add_index "words_sentences", ["word_id"], :name => "word_id"
  add_index "words_sentences", ["sentence_id"], :name => "sentence_id"

end

Comments and changes to this ticket

  • onur güngör

    onur güngör October 27th, 2008 @ 01:32 PM

    Adding the log of the version with no associations...

  • Frederick Cheung

    Frederick Cheung December 12th, 2008 @ 04:48 PM

    • State changed from “new” to “invalid”

    So word.sentences.size will cause the association to be loaded into memory and held in an instance variable on word. Ruby can't garbage collect the array holding the sentences because the word still refers to it, and Ruby can't garbage collect words you have already processed because the words array still refers to them.

    One way of dodging this is to use count instead of size (since that won't cause the target to be loaded). Another way would be to split words into batches (so that when you have finished one batch nothing refers to those words any more and they can be garbage collected). Yet another way would be to change the find call to load records (for example) 500 at a time.

    Nothing bad that rails is doing :-)

  • onur güngör

    onur güngör December 12th, 2008 @ 08:45 PM

    Thanks for the notice, I already solved the problem by using batches.

    I will follow your recommendation of using count() instead at the earliest convenience.
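
The retention pattern described in the comments above can be reproduced without Rails or a database. The sketch below is a plain-Ruby model, not ActiveRecord code: FakeWord, sentences_size, sentences_count and sentences_loaded? are hypothetical names that only imitate an owner object caching its loaded association target in an instance variable. It compares three cases: calling the size-like method on every element of one big array, calling the count-like method instead, and processing the same records in batches of 100.

```ruby
# Hypothetical stand-in for a model with a has_and_belongs_to_many
# association. It imitates only the caching behaviour described in the
# ticket; it is not how ActiveRecord is implemented.
class FakeWord
  attr_reader :id

  def initialize(id, sentence_rows)
    @id = id
    @sentence_rows = sentence_rows # stands in for rows in words_sentences
    @sentences = nil               # cached association target, not loaded yet
  end

  # Imitates size as described in the ticket: builds the full target
  # and keeps it referenced from the owner.
  def sentences_size
    @sentences ||= @sentence_rows.map { |n| "sentence #{n}" }
    @sentences.size
  end

  # Imitates count: returns the number without caching the target.
  def sentences_count
    @sentence_rows.size
  end

  def sentences_loaded?
    !@sentences.nil?
  end
end

def build_words(n)
  (1..n).map { |i| FakeWord.new(i, [1, 2, 3]) }
end

# Case 1: one big array plus the size-like call. Every processed word
# still holds its loaded target, and the array still holds every word.
words = build_words(1000)
words.each { |w| w.sentences_size }
loaded_after_size = words.count { |w| w.sentences_loaded? }

# Case 2: the count-like call never populates the cache.
words = build_words(1000)
words.each { |w| w.sentences_count }
loaded_after_count = words.count { |w| w.sentences_loaded? }

# Case 3: batches of 100. Once a batch goes out of scope, nothing refers
# to its words, so at most one batch of loaded targets is reachable.
max_loaded_in_one_batch = 0
(1..1000).each_slice(100) do |ids|
  batch = ids.map { |i| FakeWord.new(i, [1, 2, 3]) }
  batch.each { |w| w.sentences_size }
  loaded = batch.count { |w| w.sentences_loaded? }
  max_loaded_in_one_batch = loaded if loaded > max_loaded_in_one_batch
end

puts "loaded after size:  #{loaded_after_size}"
puts "loaded after count: #{loaded_after_count}"
puts "max loaded per batch: #{max_loaded_in_one_batch}"
```

In the Rake task itself, the corresponding changes would be to call word.sentences.count rather than word.sentences.size, or to fetch the words in limited chunks rather than with a single Word.find(:all). Later Rails releases (2.3 and up) also provide find_in_batches for the chunked approach; the batch size of 100 used here is arbitrary.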


Tickets have moved to GitHub

The new ticket tracker is available at https://github.com/rails/rails/issues
