This project is archived and is in readonly mode.
2.1.0 associations cause memory leak?
Reported by onur güngör | October 27th, 2008 @ 01:31 PM | in 2.x
Hi,
This is my first bug report, so please forgive me if I am missing something.
I see what looks like a memory leak when I run the following Rake task. It runs for a long time, consumes memory very quickly, and slows down as a result.
I suspect that the sentences association (has_and_belongs_to_many) in word.rb is responsible, because when I replace sentences_count = word.sentences.size with sentences_count = 0 the memory leak is gone.
I found _why's advice and think this may be related to it: http://whytheluckystiff.net/arti...
Additionally, I used the following simple memory profiler: http://scottstuff.net/blog/artic...
I will attach the output of this profiler for the two configurations, one with the association and one without.
desc 'how many times a token set is used.'
task :count_each_word_occurence_roughly => :environment do
  word_count = 0
  words = Word.find(:all, :order => "id")
  words.each do |word|
    parse_tree = word.parse_tree
    parse_tree.root_node.each do |node|
      if node.children.length > 1
        tokens = []
        node.children.each do |child|
          token = Token.find_or_create_by_text(child.name)
          tokens << token.id
        end
        tokens.sort!
        token_ids = tokens.join("-")
        token_set = TokenSet.find_or_initialize_by_token_ids(token_ids)
        sentences_count = word.sentences.size
        if token_set.roughtotal
          token_set.roughtotal += sentences_count
        else
          token_set.roughtotal = sentences_count
        end
        token_set.save
      end
    end
    word_count += 1
    if word_count % 10000 == 0
      print "processed " + word_count.to_s + " words\n"
      GC.start
    end
    # break unless word_count < 2000
  end
end
class Word < ActiveRecord::Base
  # has_many :word_parses
  # has_many :parses, :through => :word_parses
  has_and_belongs_to_many :parses, :join_table => "word_parses"
  has_and_belongs_to_many :features, :join_table => "words_features", :uniq => true
  has_and_belongs_to_many :sentences, :join_table => "words_sentences", :uniq => true
  has_many :spellings
  ...
class Sentence < ActiveRecord::Base
  has_and_belongs_to_many :words, :join_table => "words_sentences", :uniq => true
end
schema.rb
ActiveRecord::Schema.define(:version => 5) do
  create_table "features", :force => true do |t|
    t.text "description"
  end
  create_table "parses", :force => true do |t|
    t.text "parse_text", :null => false
  end
  add_index "parses", ["parse_text"], :name => "parse_text"
  create_table "sentences", :force => true do |t|
    t.text "spelling_sequence", :null => false
    t.text "correct_parse_sequence", :null => false
  end
  create_table "spellings", :force => true do |t|
    t.integer "word_id", :null => false
    t.text "text", :null => false
  end
  add_index "spellings", ["word_id"], :name => "word_id"
  add_index "spellings", ["text"], :name => "text"
  create_table "token_sets", :force => true do |t|
    t.integer "count"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.text "token_ids"
  end
  add_index "token_sets", ["token_ids"], :name => "token_ids"
  create_table "tokens", :force => true do |t|
    t.text "text"
    t.datetime "created_at"
    t.datetime "updated_at"
  end
  add_index "tokens", ["text"], :name => "text"
  create_table "word_parses", :force => true do |t|
    t.integer "word_id", :null => false
    t.integer "parse_id", :null => false
  end
  add_index "word_parses", ["word_id"], :name => "word_id"
  add_index "word_parses", ["parse_id"], :name => "parse_id"
  create_table "words", :force => true do |t|
    t.string "name", :limit => 100, :null => false
  end
  add_index "words", ["name"], :name => "name"
  create_table "words_features", :id => false, :force => true do |t|
    t.integer "word_id", :null => false
    t.integer "feature_id", :null => false
  end
  add_index "words_features", ["word_id"], :name => "word_id"
  add_index "words_features", ["feature_id"], :name => "feature_id"
  create_table "words_sentences", :id => false, :force => true do |t|
    t.integer "word_id", :null => false
    t.integer "sentence_id", :null => false
  end
  add_index "words_sentences", ["word_id"], :name => "word_id"
  add_index "words_sentences", ["sentence_id"], :name => "sentence_id"
end
Comments and changes to this ticket
-
Frederick Cheung December 12th, 2008 @ 04:48 PM
- State changed from new to invalid
word.sentences.size causes the association to be loaded into memory and stored in an instance variable on word. Ruby can't garbage collect the array holding the sentences because the word still refers to it, and Ruby can't garbage collect the words you have already processed because the words array still refers to them.
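The retention chain described here can be reproduced without Rails. The following is a minimal plain-Ruby sketch, not ActiveRecord code: FakeWord and sentences_loaded? are hypothetical names invented for illustration, and the memoized reader only imitates how a loaded association target stays attached to its owner.

```ruby
# Plain-Ruby stand-in for the behaviour described above; no Rails required.
# FakeWord and its methods are hypothetical names, not ActiveRecord API.
class FakeWord
  def initialize(id)
    @id = id
  end

  # Imitates an association reader: the first call builds the target and
  # caches it in an instance variable that lives as long as the word does.
  def sentences
    @sentences ||= Array.new(1_000) { |i| "sentence #{@id}-#{i}" }
  end

  # True once the cached target exists on this word.
  def sentences_loaded?
    !@sentences.nil?
  end
end

# Like Word.find(:all): one array that references every word for the
# whole duration of the loop.
words = Array.new(100) { |i| FakeWord.new(i) }

words.each do |word|
  word.sentences.size # builds and caches 1,000 strings on this word
end

# Every processed word is still reachable through `words`, and every cached
# array is reachable through its word, so GC.start cannot reclaim them.
GC.start
retained = words.count(&:sentences_loaded?)
puts "words still holding a loaded target: #{retained}"
```

Calling GC.start inside the loop, as the Rake task does, does not help for the same reason: nothing has become unreachable yet.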
One way of dodging this is to use count instead of size, since that won't cause the target to be loaded. Another is to split words into batches, so that once you have finished a batch nothing refers to those words any more and they can be garbage collected. Yet another is to change the find call to load records, for example, 500 at a time.
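A rough plain-Ruby sketch of the first two suggestions, under stated assumptions: JOIN_ROWS is a hypothetical in-memory stand-in for the words_sentences table, FakeWord is not an ActiveRecord class, and sentences_count only imitates what count does by asking the store for a number without caching anything on the word.

```ruby
# Hypothetical stand-in for the words_sentences join table:
# word id => array of sentence ids. Real data would live in the database.
JOIN_ROWS = (1..1_000).each_with_object({}) do |word_id, rows|
  rows[word_id] = Array.new(50) { |i| word_id * 100 + i }
end

class FakeWord
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Imitates `word.sentences.size` as described in the ticket: loads the
  # rows and keeps them on the word.
  def sentences
    @sentences ||= JOIN_ROWS.fetch(id).map { |sid| "sentence #{sid}" }
  end

  # Imitates `word.sentences.count`: returns a number, caches nothing.
  def sentences_count
    JOIN_ROWS.fetch(id).length
  end

  def sentences_loaded?
    !@sentences.nil?
  end
end

# Remedy 1: count instead of size. All words stay in one array, but none
# of them ends up holding a loaded target.
words = JOIN_ROWS.keys.map { |id| FakeWord.new(id) }
total_via_count = words.sum(&:sentences_count)
loaded_after_count = words.count(&:sentences_loaded?)

# Remedy 2: batches of 500. Each batch array goes out of scope when the
# block returns, so its words and their cached targets become collectable.
total_via_batches = 0
JOIN_ROWS.keys.each_slice(500) do |ids|
  batch = ids.map { |id| FakeWord.new(id) }
  batch.each { |word| total_via_batches += word.sentences.size }
end

puts "count: #{total_via_count}, batches: #{total_via_batches}"
puts "words holding a loaded target after counting: #{loaded_after_count}"
```

In a real Rails 2.1 task the batching would have to be done by hand, for example with :limit and :offset or id ranges on the find call; find_each and find_in_batches only arrived later, in Rails 2.3.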
Nothing bad that rails is doing :-)
-
onur güngör December 12th, 2008 @ 08:45 PM
Thanks for the notice. I already solved the problem by using batches.
I will follow your recommendation of using count() instead at the earliest convenience.
Tickets have moved to GitHub
The new ticket tracker is available at https://github.com/rails/rails/issues