This project is archived and is in readonly mode.
validates_uniqueness_of is horribly inefficient in mysql
Reported by Dave Grijalva | April 16th, 2009 @ 12:09 AM | in 3.x
mysql is case insensitive by default. for case insensitive comparisions, the following is efficient.
WHERE users.username = 'myUsername'
validates_uniqueness_of currently generates
WHERE LOWER(users.username) = BINARY 'myusername'
this is redundant and completely clobbers performance. If you have a unique index on the column (and you should, because validates_uniqueness_of cannot be trusted), you don't get any of the benefit of the index because the system has to transform the column.
Comments and changes to this ticket
-
CancelProfileIsBroken May 17th, 2009 @ 01:54 PM
- State changed from new to incomplete
Please submit a patch if you want to propose a change here. You'd want to look in activerecord/lib/active_record/validations.rb. What's happening is that the code first requests a case-sensitive equality operator, then later notices that the database is case-insensitive but doesn't change equality operators. So you'd need to fix the logic and add a method to the adapter to return a case-insensitive equality operator as well.
-
Phil Ross May 31st, 2009 @ 03:55 PM
- Tag set to activerecord, patch, validate_uniqueness_of
I'm using the attached patch to avoid the full table scan. This adds a :case_sensitive => :db option to validates_uniqueness_of, which causes the normal equality operator to be used. The validation will therefore use the database's case-sensitivity mode.
-
Scott Storck June 16th, 2009 @ 12:45 AM
I am using this patch because I had a lot of problems with the SQL generated by validate_uniqueness_of not matching an existing row which would cause a duplicate row error.
Removing the BINARY keyword from the sql results in the match working correctly.
The data being tested for uniqueness which fails to match has trailing spaces.
Mysql doesn't seem to interested in the trailing space when checking for uniqueness.
The BINARY keyword however causes mysql to do a byte for byte check.I would think it usually would make sense to keep the uniqueness test as close to the test from the underlying database that is used as possible.
I also have never used a database that doesn't use an index to enforce a unique field.
I suspect that other databases might not be able to use the index if a modifying clause like BINARY is used.
It would be interesting to hear from users of the other databases about this.I have applied the patch manually as it doesn't apply to my ActiveRecord version (2.3.2).
-
David Stevenson June 20th, 2009 @ 06:49 PM
I've had this problem before too. It's a tricky situation because it depends on the setup of MySQL (and its case sensitivity, which is different by default depending on your platform). I'm completely in favor of a :db option that puts that responsibility on the person who sets up the DB, rather than on rails itself.
-
chad.ingram (at me) June 29th, 2009 @ 11:05 PM
- Assigned user set to CancelProfileIsBroken
This now has a patch. Assigning so maybe it can move forward since it's no longer 'incomplete' but I can't change that...
-
CancelProfileIsBroken August 2nd, 2009 @ 11:13 AM
- State changed from incomplete to open
- Tag changed from activerecord, patch, validate_uniqueness_of to activerecord, bugmash, patch, validate_uniqueness_of
-
Renaud Kern August 8th, 2009 @ 10:07 PM
The patch doesn't pass. The file activerecord/test/cases/validations/uniqueness_validation_test.rb doesn't exit anymore. Test are now in activerecord/test/cases/validations_test.rb
-
Phil Ross August 8th, 2009 @ 10:47 PM
@Renaud Kern I still see activerecord/test/cases/validations/uniqueness_validation_test.rb in the master branch and my patch still applies cleanly.
There is an activerecord/test/cases/validations_test.rb file, but this doesn't contain any tests for validates_uniqueness_of.
-
Elad Meidar August 8th, 2009 @ 11:06 PM
+1 on idea, -1 on patch. Patch does not apply on 2-3-stable.
-
Hugo Peixoto August 8th, 2009 @ 11:34 PM
+1 on the idea. But the patch does not apply to 2-3-stable. I've attached a patch that ports Phil's code to this branch.
-
Jeremy Kemper August 9th, 2009 @ 09:16 AM
:case_sensitive => :db is a strange option to add to the public API.
Ideas for clarifying, or perhaps sidestepping, this issue?
-
Hugo Peixoto August 9th, 2009 @ 02:16 PM
I agree, Jeremy. But sidestepping doesn't seem like the right thing to do.
How about ":case_sensitive => :use_database_default" ?
-
Dan Croak August 10th, 2009 @ 03:01 AM
-1 on changing case_insensitive to non-boolean.
My understanding is that the problem here is the BINARY conversion for MySQL, correct? When running a 2.3.3 app, I do not see the reported LOWER().
So, the fix needs to simply be:
make MySQL act like SQLite3, for instance, and simply not add the BINARY conversion?
-
Rizwan Reza August 10th, 2009 @ 03:10 AM
-1 on changing case_sensitive. Agreed with Dan.
Hugo: The patch doesn't work anymore.
-
Elad Meidar August 13th, 2009 @ 10:13 PM
+1 on feature, although not exactly fits into the current boolean convention, it's a well required addition, i would not scope it only for mysql, since FTS are a cross RDBMS issue and there is a need to help developers deal with them before it getting to be a problem.
since the patch for 2-3-stable did work as Rizwan said, i took the liberty to fix it.
-
CancelProfileIsBroken August 19th, 2009 @ 12:38 PM
- Assigned user changed from CancelProfileIsBroken to Pratik
-
Pratik August 19th, 2009 @ 12:38 PM
- Assigned user changed from Pratik to Jeremy Kemper
-
Blue Box Jesse September 27th, 2009 @ 01:38 AM
We've seen this issue bite a number of customers, so this is a very valuable commit.
BugMash: +1
I agree that the best naming convention would be :non-boolean but using the same description field.
I've confirmed the patch applies on 2-3-stable.
-
Jordan Brough November 12th, 2009 @ 10:33 PM
I think this is worth fixing either way, but for anyone interested here's a ticket I've added for a way to sidestep the issue -- #3486 (update rails to handle db unique-constraints and omit the validates_uniqueness_of)
-
Ryan Angilly November 17th, 2009 @ 02:52 PM
Just wanted to throw my $0.02 to keep the discussion going. I do think it's a necessary fix, although I'm not crazy about the :case_sensitive => :db option. I feel like the model should reflect whether or not the intent is for case sensitivity or not. Using :case_sensitive => :db leaves it ambiguous, and would force someone to go check the database engine to find out if case sensitivity was actually enabled. I'm thinking maybe an api like:
validates_uniqueness_of :email, :db_case_sensitive => :false
And that option would turn off the LOWER and the BINARY, and let the db do its magic.
-
Ken Miller December 30th, 2009 @ 10:14 PM
I don't at present have a patch, but doesn't this seem like the sort of thing the connection adapter should be able to report on? Either by introspecting the database, or explicitly declared in the connection spec. That would require editing all the connection adapters, but that seems preferable to having to declare it in each model.
-
Rizwan Reza February 12th, 2010 @ 12:46 PM
- Tag changed from activerecord, bugmash, patch, validate_uniqueness_of to activerecord, patch, validate_uniqueness_of
-
Neeraj Singh April 20th, 2010 @ 05:04 PM
In my test index is not used when I use Lower(user.email). Otherwise index is being used. That means BINARY is not the issue as Dan Croak suggested. Please correct me if I am wrong.
I will say that current code is okay. However it should be documented that if users want better performance then they need to do two things.
- set case_sensitive to true
- add before_validation callback to lowercase the model attribute
-
metavida August 12th, 2010 @ 09:19 PM
- Importance changed from to
I wanted to agree with Neeraj that BINARY isn't the problem. On my MySQL 5.0.89 install I get the following results.
# Using the SQL generated by Rails the index is NOT used. EXPLAIN SELECT * FROM `users` WHERE LOWER(login) = BINARY 'test'; +-------------+------+------+------+------+-------------+ | select_type | type | key | ref | rows | Extra | +-------------+------+------+------+------+-------------+ | SIMPLE | ALL | NULL | NULL | 94 | Using where | +-------------+------+------+------+------+-------------+ # Using neither LOWER nor BINARY the index IS used. EXPLAIN SELECT * FROM `users` WHERE login = 'test'; +-------------+------+-------------------+-------+------+-------------+ | select_type | type | key | ref | rows | Extra | +-------------+------+-------------------+-------+------+-------------+ | SIMPLE | ref | users_login_index | const | 2 | Using where | +-------------+------+-------------------+-------+------+-------------+ # Using only BINARY the index IS used, and the comparison IS case sensitive. EXPLAIN SELECT * FROM `users` WHERE login = BINARY 'test'; +-------------+-------+-------------------+------+------+-------------+ | select_type | type | key | ref | rows | Extra | +-------------+-------+-------------------+------+------+-------------+ | SIMPLE | range | users_login_index | NULL | 2 | Using where | +-------------+-------+-------------------+------+------+-------------+
Seems like a bug in the default behavior to me. With :case_sensitive=>true we should be adding BINARY to the conditions to make the assertion work as it claims it does.
-
Grant Hollingworth January 19th, 2011 @ 04:36 PM
- Tag set to patch
Here's a patch I've been using, rebased against the current HEAD. It adds a
case_insensitive_equality_operator
method toActiveRecord::ConnectionAdapters::DatabaseStatements
, following the existingcase_sensitive_equality_operator
.MysqlAdapter
overrides it.There are no tests because it changes performance but not behaviour.
-
Raphael Sofaer February 17th, 2011 @ 05:11 AM
That patch (Grant's patch) looks like it will break case-insensitive comparison in databases with a collation like utf8_bin. I'm definitely having this problem too, though. We spend a huge amount of time in those queries.
-
Joseph Palermo May 14th, 2011 @ 11:07 PM
Pull request with proposed fix is here:
https://github.com/rails/rails/pull/561
Create your profile
Help contribute to this project by taking a few moments to create your personal profile. Create your profile »
<h2 style="font-size: 14px">Tickets have moved to Github</h2>
The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>
People watching this ticket
Attachments
Tags
Referenced by
- 1759 validate_uniqueness_of and case insensitive columns ... If we take the patch in #2503, this will be fixed. Closin...
- 3486 Alternative to validates_uniqueness_of using db constraints With this approach you don't need to do a db lookup to ch...