This project is archived and is in readonly mode.

#5924 open
Lawrence Pit

mysql collation_connection gets wrong value

Reported by Lawrence Pit | November 6th, 2010 @ 04:41 AM

The create_database method of databases.rake creates a MySQL database with a default charset of utf8 and a default collation of utf8_unicode_ci. These values can be overridden by providing :charset and :collation keys, respectively, in database.yml.
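As a minimal sketch of how those defaults and overrides combine, assuming a config hash shaped like a database.yml entry: `create_database_sql` and the two constants below are hypothetical names used for illustration, not the actual code in databases.rake.

```ruby
# Hypothetical helper illustrating the defaults described above; the real
# logic in databases.rake and the mysql adapter may differ in detail.
DEFAULT_CHARSET   = "utf8"
DEFAULT_COLLATION = "utf8_unicode_ci"

def create_database_sql(config)
  # :charset and :collation from database.yml override the defaults.
  charset   = config[:charset]   || DEFAULT_CHARSET
  collation = config[:collation] || DEFAULT_COLLATION
  "CREATE DATABASE `#{config[:database]}` " \
    "DEFAULT CHARACTER SET #{charset} COLLATE #{collation}"
end

puts create_database_sql(database: "app_development")
puts create_database_sql(database: "app_development",
                         charset: "latin1", collation: "latin1_swedish_ci")
```

Note that nothing in this statement affects the connection-level variables, which is where the problem reported here arises.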

When connecting to the database, though, the connection by default uses these values:

 {"Variable_name"=>"character_set_client", "Value"=>"latin1"},
 {"Variable_name"=>"character_set_connection", "Value"=>"latin1"},
 {"Variable_name"=>"character_set_database", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_results", "Value"=>"latin1"},
 {"Variable_name"=>"character_set_server", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_system", "Value"=>"utf8"},
 {"Variable_name"=>"collation_connection", "Value"=>"latin1_swedish_ci"},
 {"Variable_name"=>"collation_database", "Value"=>"utf8_unicode_ci"},
 {"Variable_name"=>"collation_server", "Value"=>"utf8_unicode_ci"}

The only way to get some control over this is to explicitly define encoding: utf8 in database.yml. You'd then get:

 {"Variable_name"=>"character_set_client", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_connection", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_database", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_results", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_server", "Value"=>"utf8"},
 {"Variable_name"=>"character_set_system", "Value"=>"utf8"},
 {"Variable_name"=>"collation_connection", "Value"=>"utf8_general_ci"},
 {"Variable_name"=>"collation_database", "Value"=>"utf8_unicode_ci"},
 {"Variable_name"=>"collation_server", "Value"=>"utf8_unicode_ci"}

I.e., collation_connection still has the wrong value in this case: its value is always the default collation for the character set of the connection (here utf8, whose default collation is utf8_general_ci) rather than the utf8_unicode_ci collation of the database.
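The rule can be modelled in a few lines of Ruby. This is only an illustrative lookup covering the two character sets that appear in this ticket; `DEFAULT_COLLATION_FOR` and `collation_after_set_names` are hypothetical names, and the snippet does not talk to MySQL.

```ruby
# Default collation MySQL assigns to each character set (only the two
# character sets seen in this ticket are listed).
DEFAULT_COLLATION_FOR = {
  "latin1" => "latin1_swedish_ci",
  "utf8"   => "utf8_general_ci"
}.freeze

# Hypothetical model of the collation_connection value that results from a
# plain SET NAMES '<encoding>' with no COLLATE clause.
def collation_after_set_names(encoding)
  DEFAULT_COLLATION_FOR.fetch(encoding)
end

database_collation = "utf8_unicode_ci"
connection_collation = collation_after_set_names("utf8")

# The mismatch this ticket is about: the two values differ.
puts connection_collation == database_collation
```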

The only solution I see is to patch the +configure_connection+ method of the mysql adapter.

There are two options:

  1. Instead of SET NAMES 'utf8', it should be possible to issue SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'. For this I propose that the +configure_connection+ method (re)use the :charset and :collation values from database.yml (in preference to :encoding), just like the databases rake tasks do.

  2. Instead of SET NAMES 'utf8', it should issue SET CHARACTER SET 'utf8'. In that case collation_connection is set to the value of collation_database.
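The two options can be sketched as statement builders. This is a rough illustration under stated assumptions, not the adapter's actual +configure_connection+ code: `set_names_statement` and `set_character_set_statement` are hypothetical helpers, the config hash mimics a database.yml entry, and values are interpolated unescaped on the assumption that the config is trusted.

```ruby
# Option 1 (hypothetical helper): prefer :charset over :encoding and append
# a COLLATE clause when :collation is configured.
def set_names_statement(config)
  charset = config[:charset] || config[:encoding]
  return nil unless charset

  sql = "SET NAMES '#{charset}'"
  sql += " COLLATE '#{config[:collation]}'" if config[:collation]
  sql
end

# Option 2 (hypothetical helper): let MySQL derive collation_connection
# from collation_database.
def set_character_set_statement(config)
  charset = config[:charset] || config[:encoding]
  charset && "SET CHARACTER SET '#{charset}'"
end

config = { charset: "utf8", collation: "utf8_unicode_ci" }
puts set_names_statement(config)
puts set_character_set_statement(config)
```

With option 1 the collation is stated explicitly in database.yml; with option 2 the connection follows whatever collation the database currently has.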

Which option is preferable?

Comments and changes to this ticket

  • Santiago Pastorino

    Santiago Pastorino February 6th, 2011 @ 09:01 PM

    • State changed from “new” to “open”

    This issue has been automatically marked as stale because it has not been commented on for at least three months.

    The resources of the Rails core team are limited, and so we are asking for your help. If you can still reproduce this error on the 3-0-stable branch or on master, please reply with all of the information you have about it and add "[state:open]" to your comment. This will reopen the ticket for review. Likewise, if you feel that this is a very important feature for Rails to include, please reply with your explanation so we can consider it.

    Thank you for all your contributions, and we hope you will understand this step to focus our efforts where they are most helpful.

  • Santiago Pastorino

    Santiago Pastorino February 6th, 2011 @ 09:01 PM

    • State changed from “open” to “stale”
  • Lawrence Pit

    Lawrence Pit February 6th, 2011 @ 10:46 PM

    • State changed from “stale” to “open”

    [state:open] must be fixed.


Tickets have moved to GitHub

The new ticket tracker is available at https://github.com/rails/rails/issues
