This project is archived and is in readonly mode.

#4304 ✓resolved
Doug Richardson

Cannot set postgres adapter encoding to anything besides utf-8

Reported by Doug Richardson | March 31st, 2010 @ 04:30 PM

In Rails 2.3.5, the postgres adapter ignores the encoding field. If you try to set the encoding to sql_ascii, for example, the adapter ignores that setting and instead uses the default of utf-8.

Another problem is that Rails generates a PostgreSQL database.yml file with encoding set to unicode, which is not a valid PostgreSQL encoding (although Rails could treat this as an alias for utf-8).

Valid postgresql encodings are defined here: http://www.postgresql.org/docs/8.4/static/multibyte.html
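The alias idea mentioned above could look roughly like the following in plain Ruby. This is only a sketch: `ENCODING_ALIASES` and `normalize_encoding` are hypothetical names, not part of Rails, and the only alias taken from this ticket is unicode to utf8.

```ruby
# Hypothetical alias table. Only "unicode" => "utf8" comes from this ticket;
# anything else would have to be added deliberately.
ENCODING_ALIASES = { "unicode" => "utf8" }.freeze

# Hypothetical helper: map a configured encoding name to the name that
# would be sent to PostgreSQL. Names without an alias pass through unchanged.
def normalize_encoding(name)
  name = name.to_s
  # Look the name up case-insensitively, but return unknown names as given.
  ENCODING_ALIASES.fetch(name.downcase, name)
end

puts normalize_encoding("unicode")
puts normalize_encoding("sql_ascii")
```

Keeping the table separate from the lookup means another alias is a one-line change, and names that are already valid are never rewritten.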

Comments and changes to this ticket

  • Doug Richardson

    Doug Richardson March 31st, 2010 @ 05:32 PM

    I have a patch for this and will submit it after I read the contributors guide and get my environment configured to run "rake test_postgresql" on Rails 3.

  • Yehuda Katz (wycats)

    Yehuda Katz (wycats) March 31st, 2010 @ 07:49 PM

    • Assigned user set to “Rizwan Reza”

    @doug is this bug also in 3.0?

  • Rizwan Reza

    Rizwan Reza March 31st, 2010 @ 08:18 PM

    Please let us know if you run into problems making a patch etc. You can find me on the #rails-contrib channel on freenode IRC. Thanks.

  • Doug Richardson

    Doug Richardson March 31st, 2010 @ 09:28 PM

    • Assigned user cleared.

    I'm not sure how to add tests for databases.rake so I will do this later. Here's the patch in case anyone needs it now (this was a patch to 2.3.5):

    Index: server/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb
    --- server/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb (revision 94349)
    +++ server/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb (revision 94350)
    @@ -609,11 +609,16 @@
     #
     # Example:
     #   create_database config[:database], config
    -#   create_database 'foo_development', :encoding => 'unicode'
    +#   create_database 'foo_development', :encoding => 'utf8'
     def create_database(name, options = {})
    -  options = options.reverse_merge(:encoding => "utf8")
    +  options = options.symbolize_keys.reverse_merge(:encoding => "utf8")
    +
    +  # Rails database.yml config files used to be generated with the invalid "unicode" encoding.
    +  if options[:encoding] == "unicode"
    +    options[:encoding] = "utf8"
    +  end
     
    -  option_string = options.symbolize_keys.sum do |key, value|
    +  option_string = options.sum do |key, value|
         case key
         when :owner
           " OWNER = \"#{value}\""


    Index: server/vendor/rails/railties/lib/tasks/databases.rake
    --- server/vendor/rails/railties/lib/tasks/databases.rake (revision 94349)
    +++ server/vendor/rails/railties/lib/tasks/databases.rake (revision 94350)
    @@ -64,7 +64,7 @@
         $stderr.puts "Couldn't create database for #{config.inspect}, charset: #{config['charset'] || @charset}, collation: #{config['collation'] || @collation} (if you set the charset manually, make sure you have a matching collation)"
       end
     when 'postgresql'
    -  @encoding = config[:encoding] || ENV['CHARSET'] || 'utf8'
    +  @encoding = config['encoding'] || ENV['CHARSET'] || 'utf8'
       begin
         ActiveRecord::Base.establish_connection(config.merge('database' => 'postgres', 'schema_search_path' => 'public'))
         ActiveRecord::Base.connection.create_database(config['database'], config.merge('encoding' => @encoding))
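For anyone reading the adapter half of the patch without a Rails checkout, here is a rough standalone sketch of the key-normalization step: accept string or symbol keys, default the encoding to utf8, and build the CREATE DATABASE statement. `create_database_sql` is a hypothetical name, it avoids ActiveSupport on purpose, and it is not the adapter's actual implementation.

```ruby
# Hypothetical stand-in for the option handling in the adapter's
# create_database. Values are interpolated without escaping, so this is
# for illustration only and not safe for untrusted input.
def create_database_sql(name, options = {})
  # Plain-Ruby equivalent of symbolizing keys, so "encoding" and
  # :encoding are treated the same.
  symbolized = {}
  options.each { |key, value| symbolized[key.to_sym] = value }

  # Caller-supplied options win over the utf8 default.
  opts = { :encoding => "utf8" }.merge(symbolized)

  clauses = []
  clauses << " OWNER = \"#{opts[:owner]}\"" if opts[:owner]
  clauses << " ENCODING = '#{opts[:encoding]}'" if opts[:encoding]

  "CREATE DATABASE \"#{name}\"#{clauses.join}"
end

puts create_database_sql("doug_development", { "encoding" => "sql_ascii" })
puts create_database_sql("doug_development")
```

With string keys normalized up front, a config hash loaded from YAML and a hand-written options hash with symbol keys produce the same statement.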
      
  • Doug Richardson

    Doug Richardson March 31st, 2010 @ 09:31 PM

    OMG, I'm such a formatting n00b. Sorry about that.

    @Yehuda, yes it looks like the bug is also in 3.0 and I was actually going to make the fix there.

    @Rizwan I'll probably need your help figuring out how to set up a unit test for databases.rake. I wasn't able to find any existing ones. I'll contact you on the IRC channel.

  • Doug Richardson

    Doug Richardson March 31st, 2010 @ 09:32 PM

    • Assigned user set to “Rizwan Reza”

    Another n00bism, I accidentally cleared the assignment so resetting back to Rizwan.

  • Doug Richardson

    Doug Richardson April 1st, 2010 @ 06:09 AM

    I can reproduce this problem in rails 3.0 as well. Since I left out the reproduction steps in the problem description, here they are:

    1. Create a postgres database cluster with default encoding set to something besides UTF8 (e.g. sql_ascii):
      initdb --encoding=sql_ascii .
      
    2. Modify database.yml to set the encoding to sql_ascii
    3. rake db:create

    The result is the following error.

    DougMacBookPro:doug doug$ rake db:create
    (in /Users/doug/test/doug)
    PGError: ERROR:  new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII)
    HINT:  Use the same encoding as in the template database, or use template0 as template.
    : CREATE DATABASE "doug_development" ENCODING = 'utf8'
    /Library/Ruby/Gems/1.8/gems/activerecord-3.0.0.beta1/lib/active_record/connection_adapters/abstract_adapter.rb:207:in `log'
    /Library/Ruby/Gems/1.8/gems/activerecord-3.0.0.beta1/lib/active_record/connection_adapters/postgresql_adapter.rb:559:in `execute'
    /Library/Ruby/Gems/1.8/gems/activerecord-3.0.0.beta1/lib/active_record/connection_adapters/postgresql_adapter.rb:642:in `create_database'
    /Library/Ruby/Gems/1.8/gems/activerecord-3.0.0.beta1/lib/active_record/railties/databases.rake:100:in `create_database'
    /Library/Ruby/Gems/1.8/gems/activerecord-3.0.0.beta1/lib/active_record/railties/databases.rake:35
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `call'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `execute'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `each'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `execute'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:597:in `invoke_with_call_chain'
    /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:583:in `invoke'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2051:in `invoke_task'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `each'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2023:in `top_level'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2001:in `run'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
    /Users/doug/.gem/ruby/1.8/gems/rake-0.8.7/bin/rake:31
    /usr/bin/rake:19:in `load'
    /usr/bin/rake:19
    Couldn't create database for {"encoding"=>"sql_ascii", "username"=>"doug", "adapter"=>"postgresql", "database"=>"doug_development", "pool"=>5, "password"=>nil}
    

    The databases.rake file ignores the configuration setting because of the following line:

    @encoding = config[:encoding] || ENV['CHARSET'] || 'utf8'
    
    At that point, config is keyed by the string 'encoding' NOT the symbol :encoding. The correct line is:
    @encoding = config['encoding'] || ENV['CHARSET'] || 'utf8'
    

    Originally (see horrible code formatting above) I thought there was an issue in the postgres adapter as well, but after running the unit tests that code looks fine.
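The string-versus-symbol mismatch described above can be reproduced in plain Ruby without Rails or a database. In this sketch, `pick_encoding` is a hypothetical helper that mirrors the fallback chain from the rake task, and the environment is passed in as an ordinary hash (an assumption made so the example does not depend on the real ENV).

```ruby
# Stand-in for one environment's entry from database.yml. YAML parsing
# yields String keys, not Symbols.
config = {
  "adapter"  => "postgresql",
  "encoding" => "sql_ascii",
  "database" => "doug_development"
}

# Hypothetical helper mirroring the rake task's fallback chain:
# config value, then CHARSET from the environment, then "utf8".
def pick_encoding(config, key, env = {})
  config[key] || env["CHARSET"] || "utf8"
end

# The Symbol key is not present in the hash, so the lookup returns nil
# and the chain falls through to the "utf8" default.
puts "symbol key: #{pick_encoding(config, :encoding)}"

# The String key matches what was loaded, so the configured value is used.
puts "string key: #{pick_encoding(config, 'encoding')}"
```

Because a missing key just returns nil, the wrong key type raises no error. The configured value is silently replaced by the default, which is why the problem only shows up when the default is incompatible with the cluster.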

  • Doug Richardson

    Doug Richardson April 1st, 2010 @ 07:36 AM

    I looked into this more and it appears UNICODE was a valid character encoding as recently as postgres 8.0 (http://www.postgresql.org/docs/8.0/interactive/multibyte.html).

    Given that, it's not clear what postgres encoding the rails command should use when creating a new application. To generate the correct encoding, rails would have to prompt for the postgres version.

    Note: I have not tested on a postgres 8.0 installation. It's possible it understands the utf8 encoding even though it's not documented. In that case, defaulting to utf8 would be the best option.

  • Doug Richardson

    Doug Richardson April 1st, 2010 @ 07:57 AM

    Postgres 8.4 still honors unicode encoding even though it's not listed in the documentation. That means the simple change from :encoding to 'encoding' should be enough to close this bug. I'm submitting the patch now.

  • Doug Richardson

    Doug Richardson April 1st, 2010 @ 08:04 AM

    • Tag changed from activerecord, postgresql to activerecord, patch, postgresql
  • Repository

    Repository April 1st, 2010 @ 04:37 PM

    • State changed from “new” to “resolved”

    (from [e8292abbcd581f2fdad368fc5760416071b4b67f]) Read postgresql encoding using string key instead of symbol [#4304 state:resolved]

    Signed-off-by: wycats <wycats@gmail.com>
    http://github.com/rails/rails/commit/e8292abbcd581f2fdad368fc576041...

  • Venkat

    Venkat April 14th, 2010 @ 02:01 PM

    • Assigned user cleared.

    Dear Sir,

    I have a CSV file that contains encoded values such as !@#$%^&&_+ and other special characters. When I try to import the CSV file into PostgreSQL, I get the error below:

    "ERROR: invalid byte sequence for encoding "UTF8": 0xb0 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
    CONTEXT: COPY testtableinfo, line 208 "

    I do not understand what I am doing wrong. Can anyone please guide me? I look forward to your response.

    Thanks and regards,

    Ven
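The error in the comment above means byte 0xb0 arrived where the server expected UTF-8. One possible way to handle that before the import, sketched in Ruby: re-encode the text to UTF-8 first. The source encoding here (ISO-8859-1, where 0xB0 is the degree sign) is purely an assumption, since the ticket does not say how the CSV was produced, and `to_utf8` is a hypothetical helper.

```ruby
# Hypothetical helper: reinterpret the raw bytes as the assumed source
# encoding, then transcode them to UTF-8. ISO-8859-1 is a guess; use the
# encoding the file was actually written in.
def to_utf8(text, source_encoding = "ISO-8859-1")
  text.dup.force_encoding(source_encoding).encode("UTF-8")
end

sample = "25\xB0C"  # a lone 0xB0 byte is not valid UTF-8

puts "valid before: #{sample.valid_encoding?}"
converted = to_utf8(sample)
puts "valid after:  #{converted.valid_encoding?}"
```

If the guess about the source encoding is wrong, the result will still be valid UTF-8 but may show the wrong characters, so it is worth checking a few converted lines by eye before importing.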


Tickets have moved to GitHub

The new ticket tracker is available at https://github.com/rails/rails/issues
