
#2188 open
Jonas Nicklas

Encoding error in Ruby1.9 for templates

Reported by Jonas Nicklas | March 9th, 2009 @ 11:44 PM | in 2.3.10

In Ruby 1.9 translating Strings which have non-ascii characters in them does not work for me.

If I have keys like this in my translation file:

"sv":
  test1: blah
  test2: blåh

Calling this works fine:

I18n.translate(:test1)

However, calling this raises an exception:

I18n.translate(:test2)

Here's the error:

ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)

This is the same error as in #2038. I ran this against edge Rails, and it looked like the patch from #2038 had already been applied, so I am assuming this is a different issue.

Comments and changes to this ticket

  • Jonas Nicklas

    Jonas Nicklas March 18th, 2009 @ 10:19 PM

    • Tag changed from 2.3-rc2, i18n to 2.3-rc2, ruby1.9
    • Title changed from “i18n fails with multibyte Strings in Ruby 1.9 (similar to #2038)” to “Encoding error in Ruby1.9 for templates”

    I figured out now that I18n isn't the culprit. I18n.t returns a UTF-8 string; the issue seems to be that templates are ASCII-8BIT encoded by default, and when a UTF-8 string is used they switch over.

    <%= "å" %><%= "å".encoding %>

    Works, and would return 'åASCII-8BIT'

    <%= "å".force_encoding('utf-8') %>

    Also works. However:

    <%= "å" %><%= "å".force_encoding('utf-8') %>

    Fails with the above mentioned error.

    I have attached a test case that proves the bug.

  • Mauricio Eduardo Szabo

    Mauricio Eduardo Szabo March 26th, 2009 @ 01:36 AM

    I confirm this error on Rails 2.3.2 and Ruby1.9.

    If, for example, I add on one controller: @errors = ["Á", 'Bê']

    On any view, a simple: <%= @errors.inspect %>

    throws the error incompatible character encodings: ASCII-8BIT and UTF-8

  • Hector E. Gomez Morales

    Hector E. Gomez Morales March 27th, 2009 @ 06:58 PM

    • Tag changed from 2.3-rc2, ruby1.9 to 2.3-rc2, patch, ruby1.9

    The problem is the ERB code in the Ruby 1.9 distribution. When it compiles the template code it forces an ASCII-8BIT encoding. If the template code has multibyte characters, it is still returned as an ASCII-8BIT string, and when this string is concatenated with a UTF-8 string that contains multibyte characters the exception is raised, because strings in these two encodings are only compatible when both contain nothing but seven-bit characters.

    This patch is the result of my research for my proposal for end-to-end encoding support for Rails. I am working on a patch for ERB to resolve this problem. The included patch is a quick hack to force the encoding of the template method code to be UTF-8. #1988 is a duplicate of this bug.
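
    The compatibility rule described above can be reproduced in plain Ruby. This is only a minimal sketch, not the code from the attached patch: the helper name concat_compatible? is made up for illustration, and the strings are written with \u escapes so the example does not depend on the encoding of the source file.

    ```ruby
    # Hypothetical helper (not part of Rails or ERB): true when Ruby could
    # concatenate the two strings without raising.
    def concat_compatible?(a, b)
      !Encoding.compatible?(a, b).nil?
    end

    utf8         = "bl\u00e5h"   # "blåh", tagged UTF-8
    binary_ascii = "blah".b       # tagged ASCII-8BIT, seven-bit bytes only
    binary_multi = utf8.b         # same bytes as utf8, tagged ASCII-8BIT

    puts concat_compatible?(binary_ascii, utf8)   # true: the seven-bit side is harmless
    puts concat_compatible?(binary_multi, utf8)   # false: both sides carry high bytes

    begin
      binary_multi + utf8
    rescue Encoding::CompatibilityError => e
      puts e.class   # the same error class that the template rendering runs into
    end
    ```

    This would explain why a template made of plain ASCII renders fine and only breaks once both the compiled template and the interpolated value contain non-ASCII bytes.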

  • crazy_bug (at terletzki)

    crazy_bug (at terletzki) April 6th, 2009 @ 01:57 PM

    Hello! We've got the same problem, except that the error occurs when we fetch data from the database. We're using MySQL and the charset is UTF-8, but Active Record returns ASCII-8BIT. Is it possible to make changes to activerecord similar to the ones you made to actionpack? It seems we're not the only ones with that problem (http://groups.google.com/group/r...). Can somebody help me with this? Thanks!

  • Hector E. Gomez Morales

    Hector E. Gomez Morales April 6th, 2009 @ 03:03 PM

    I will take a look and post any findings.

  • Hector E. Gomez Morales

    Hector E. Gomez Morales April 10th, 2009 @ 04:38 PM

    Hi, sorry to be so late, but I have some solutions to this problem. Please take a look at #2476.

  • Mauricio Eduardo Szabo

    Mauricio Eduardo Szabo April 13th, 2009 @ 02:18 PM

    Hector, sorry but this is not my problem. My problem is not when I fetch data from a database, it's in template rendering, as I showed in my previous post. The ERB workaround, by the way, worked for me.

    (By the way, I use the postgres-pr adapter to fetch data from my database)

  • hkstar

    hkstar April 19th, 2009 @ 11:12 PM

    +1 to Hector's workaround-erb.diff patch, works for me.

  • Portfonica

    Portfonica April 20th, 2009 @ 01:07 AM

    I'm afraid hector's patch doesn't resolve the problem with another template system such as HAML. :-/ So I think this patch isn't useful.

  • Hector E. Gomez Morales

    Hector E. Gomez Morales April 21st, 2009 @ 01:14 AM

    This patch is concerned with ERB as the default templating engine, which I think a lot of people use. If you have a particular HAML template that presents the same problem, can you provide it so I can dig out the proper fix?

  • qoobaa

    qoobaa May 12th, 2009 @ 11:31 PM

    • Tag changed from 2.3-rc2, patch, ruby1.9 to 2.3-rc2, patch, ruby1.9, tested

    I've created a patch that fixes the problems described by Jonas Nicklas. Now everything in views is encoded using UTF-8. The bad news is that a lot of things are broken now. The problems described with HAML may be caused by Rack params encoding (ASCII-8BIT) and sqlite3-ruby string encoding (ASCII-8BIT). I've created a ticket in Rack's lighthouse; we also need to fix the sqlite3-ruby gem. Does anybody use the mysql or pg gems? Are they broken as well?

  • Mauricio Eduardo Szabo

    Mauricio Eduardo Szabo May 13th, 2009 @ 04:39 PM

    with templates_using_utf_8_encoding patch, I confirm there are problems with the postgresql gem (even with the postgres-pr gem).

    One more thing, now line errors on templates are wrong (when I have an error on line #14, rails says it's on line #15).

  • Manfred Stienstra

    Manfred Stienstra May 13th, 2009 @ 04:46 PM

    Also, the utf_8_encoding patch assumes that everyone will want to use UTF-8 in their templates, this might not be the case.

  • qoobaa

    qoobaa May 13th, 2009 @ 05:14 PM

    The UTF-8 encoding could easily be made changeable (we can put some variable there), but we'd have to provide some configuration for that (in environment.rb?). I've fixed the issue in the sqlite3-ruby gem (http://github.com/qoobaa/sqlite3-ruby), however it has no UTF-16 support yet (to be done). I've tried to fix the pg gem, but I need to read the Postgres documentation first (the version in my repository uses UTF-8 as the default encoding). I've also created the ticket on Rack's lighthouse.

  • Portfonica

    Portfonica May 26th, 2009 @ 12:41 PM

    Hector: I don't have any solution, I have only a hack. You can put these lines into your environment.rb

    Encoding.default_internal = 'utf-8'
    Encoding.default_external = 'utf-8'

    Oh, hack != solution :)

  • Anton Ageev

    Anton Ageev May 31st, 2009 @ 12:54 PM

    Strings in params[] have ASCII-8BIT encoding too. Is it a Rack issue?

  • Anton Ageev

    Anton Ageev May 31st, 2009 @ 01:00 PM

    Hector: I don't have any solution, I have only a hack. You can put these lines into your environment.rb

    Encoding.default_internal = 'utf-8'
    Encoding.default_external = 'utf-8'
    

    This doesn't work for me.

  • qoobaa

    qoobaa May 31st, 2009 @ 01:42 PM

    The params ASCII-8BIT encoding is a Rack issue: http://rack.lighthouseapp.com/projects/22435/tickets/48-rackutilsun...

    Changing the default internal and external encoding also doesn't work in my app.

  • Adam S

    Adam S July 22nd, 2009 @ 04:56 PM

    I'm also seeing this issue with Ruby 1.9 and HAML templates.
    It's very annoying and confusing... I think the only solution is for Rails to set a default encoding in environment.rb and then do translation from other encodings...
    Raising errors for every encoding type is silly.
    I basically want my whole app to use utf8, others may want another encoding, then fine just put it in environment.rb.

  • Jérôme

    Jérôme August 16th, 2009 @ 12:44 AM

    It would be great if Rails could spare us from editing all our Ruby files that contain Unicode characters. It feels like a regression with Ruby 1.9 when I have to add a # encoding: utf-8 header to my hundreds of files...

  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 03:44 AM

    I would also like to see an environment line where one can set the application-wide encoding instead of adding magic comments to all files. Also, I did not manage to add magic comments to the .erb files. How would this work?

    At least those two possibilities did not work for me:

    <%#= encoding: utf-8 %>
    <%# encoding: utf-8 %>

    -act

  • Jeremy Kemper

    Jeremy Kemper August 26th, 2009 @ 06:58 AM

    You guys aren't saying where you get these Encoding errors. Please include backtraces or, better, failing test cases so we can reproduce.

    UTF-8 is already the default external encoding. The magic comments are only if you want to write a template in a different encoding than the default.

  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 01:32 PM

    Reproduce the problem by using this process

    rails utf8errors -d mysql
    # add credentials if necessary to config/database.yml
    cd utf8errors
    rake db:create
    script/generate controller utf8errors index
    script/generate model user
    
    # add "t.string :name" to the migration file before next step
    rake db:migrate
    touch app/views/utf8errors/_partial.html.erb
    echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
    echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
    echo "Inserting User with multibyte characters" >> app/views/utf8errors/_partial.html.erb
    echo "<% User.create(:name => 'Multibyte Username öäü') %>" >> app/views/utf8errors/_partial.html.erb
    echo "Multibyte String from database does NOT work now:" >> app/views/utf8errors/_partial.html.erb
    echo "<% User.all.each do |u| %>" >> app/views/utf8errors/_partial.html.erb
    echo "<%= u.name %>" >> app/views/utf8errors/_partial.html.erb
    echo "<% end %>" >> app/views/utf8errors/_partial.html.erb
    echo "<%= render :partial => 'partial' %>" >> app/views/utf8errors/index.html.erb
    

    take care to use ruby 1.9.1 when starting the server

    ./script/server

    => surf to http://127.0.0.1:3000/utf8errors # should display the error

    
    The error does NOT appear when using the workaround patch by hector which adds the line
    source.force_encoding('utf-8') if '1.9'.respond_to?(:force_encoding) to the actionpack/lib/action_view/renderable.rb
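
    For readers unsure what that one line does, here is a small sketch. The wrapper name retag_as_utf8 is hypothetical and the sample byte strings are assumptions chosen for illustration; only the force_encoding call itself comes from the workaround.

    ```ruby
    # Hypothetical wrapper around the one-line workaround quoted above.
    # force_encoding changes only the encoding tag; the bytes stay as they are.
    def retag_as_utf8(source)
      source.force_encoding('utf-8') if source.respond_to?(:force_encoding)
      source
    end

    utf8_bytes   = "\u00f6\u00e4\u00fc".b   # "öäü" saved as UTF-8, tagged ASCII-8BIT
    latin1_bytes = "\xF6\xE4\xFC".b         # "öäü" saved as ISO-8859-1

    puts retag_as_utf8(utf8_bytes).valid_encoding?     # true: the bytes are valid UTF-8
    puts retag_as_utf8(latin1_bytes).valid_encoding?   # false: retagging does not convert
    ```

    So the workaround is only safe when the template files really are saved as UTF-8.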


    Backtrace:
    ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8) on line #7 of app/views/utf8errors/_partial.html.erb:
    4: <% User.create(:name => 'Multibyte Username öäü') %>
    5: Multibyte String from database does NOT work now:
    6: <% User.all.each do |u| %>
    7: <%= u.name %>
    8: <% end %>
    app/views/utf8errors/_partial.html.erb:7:in `concat'
    app/views/utf8errors/_partial.html.erb:7:in `block in _run_erb_app47views47utf8errors47_partial46html46erb_locals_object_partial'
    app/views/utf8errors/_partial.html.erb:6:in `each'
    app/views/utf8errors/_partial.html.erb:6
    app/views/utf8errors/index.html.erb:3
    <internal:prelude>:8:in `synchronize'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
    /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
    /usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'
    

    Rendered rescues/trace (80.9ms)
    Rendered rescues/request_and_response (0.9ms)
    Rendering rescues/layout (internal_server_error)

    
    I hope this helps and that the formatting is working...


    -act
  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 01:35 PM

    okay the formatting is kinda broken but i think you get the idea ... in addition, it should fail with Ruby 1.9 (instead of 1.9.1) as well. Also the problem arises with postgresql too (haven't tested sqlite3).

    -act

  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 02:11 PM

    One more note. I just rechecked this process with postgresql 8.4 and in this case the workaround-erb patch by hector is NOT working. Sorry for the confusion before. So summarized for my Setup:

    Ruby 1.9.x + Rails 2.3.3 + Mysql 5.1 => not working
    Ruby 1.9.x + Rails 2.3.3 + with hector patch + Mysql 5.1 => working
    Ruby 1.9.x + Rails 2.3.3 (with and without hector patch) + Postgresql 8.4 => not working

    Greets
    -act

  • Adam S

    Adam S August 26th, 2009 @ 02:11 PM

    Are you sure this isn't the mysql gem?

    I used the process described at the link below, and I can create multibyte users etc. In fact no real issues with multibyte now...

    http://www.taylorluk.com/articles/2009/08/12/ruby-19-and-passenger

    Also used this adapter for sqlite3:

    http://github.com/qoobaa/sqlite3-ruby/tree/master

  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 03:25 PM

    thank you, i updated from mysql gem 2.7 to the self-built 2.81 .. with the process above i got the error 'uninitialized constant Encoding::UTF' accessing http://localhost:3000/utf8errors using Rails 2.3.3

    when i changed in Rack::utils.rb

    RUBY_VERSION >= "1.9" ? result.force_encoding(Encoding::UTF-8) : result

    for

    RUBY_VERSION >= "1.9" ? result.force_encoding('utf-8') : result

    the rendering worked indeed.
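
    The NameError is consistent with how Ruby reads that expression: a constant name cannot contain a hyphen, so Encoding::UTF-8 is parsed as Encoding::UTF minus 8. The constant for this encoding is spelled Encoding::UTF_8. A short sketch (the helper lookup_error_for is made up for this illustration):

    ```ruby
    # Hypothetical helper: evaluate an expression and return the NameError it
    # raises, or nil if it evaluates cleanly.
    def lookup_error_for(expression)
      eval(expression)
      nil
    rescue NameError => e
      e
    end

    puts lookup_error_for("Encoding::UTF-8").class   # NameError: there is no constant UTF
    puts Encoding.const_defined?(:UTF_8)             # true: the underscore spelling exists

    # The constant and the string name refer to the same encoding.
    puts Encoding::UTF_8 == Encoding.find('utf-8')   # true
    ```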

    Here is the output without alteration for the interested:

    
    [2009-08-26 16:19:01] ERROR NameError: uninitialized constant Encoding::UTF
        /usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:105:in `rescue in const_missing'
        /usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:94:in `const_missing'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/utils.rb:27:in `unescape'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:36:in `file_exist?'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:18:in `call'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:46:in `block in call'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `each'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `call'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/log_tailer.rb:17:in `call'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/content_length.rb:13:in `call'
        /usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/handler/webrick.rb:46:in `service'
        /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
        /usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
        /usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'
    
    errors
    
    
    However, to recap: Does the Problem lie in the database connectors? This would mean one must wait for updated versions of the pg and probably mysql gem ...
    
    greets
    -act
    
  • Rocco Di Leo

    Rocco Di Leo August 26th, 2009 @ 04:02 PM

    Okay, i now reinstalled actionpack, rack and the pg-gem. Postgresql works now without any modification. Also the Encoding:: Error has disappeared. I don't know why the problem occurred before but for now the problem is solved. When using MySQL, the 2.81-Version is needed as discussed. I will recheck on different machines and operating systems later since there must be some bug or unlucky condition somewhere which resulted in the problem before.

  • Manfred Stienstra

    Manfred Stienstra August 26th, 2009 @ 04:07 PM

    Rocco, first off: thanks for all the effort you're putting into this! Do you think you can do all your investigating first and post a short summary with proper formatting afterwards? It's becoming hard to find actual information in your torrent of posts.

  • engineerDave

    engineerDave September 23rd, 2009 @ 06:53 AM

    I get this error just by having quotes in the text being displayed.

    ActionView::TemplateError (incompatible character encodings: UTF-8 and ASCII-8BIT) on line #30
    ... app/views/blogs/index.html.erb:30:in `concat'

    app/views/blogs/index.html.erb:30:in `block in _run_erb_app47views47blogs47index46html46erb'
    app/views/blogs/index.html.erb:27:in `each'
    app/views/blogs/index.html.erb:27
    app/controllers/blogs_controller.rb:33:in `index'
    <internal:prelude>:8:in `synchronize'
    thin (1.2.4) lib/thin/connection.rb:76:in `block in pre_process'
    thin (1.2.4) lib/thin/connection.rb:74:in `catch'
    thin (1.2.4) lib/thin/connection.rb:74:in `pre_process'
    thin (1.2.4) lib/thin/connection.rb:57:in `process'
    thin (1.2.4) lib/thin/connection.rb:42:in `receive_data'
    eventmachine (0.12.8) lib/eventmachine.rb:242:in `run_machine'
    eventmachine (0.12.8) lib/eventmachine.rb:242:in `run'
    thin (1.2.4) lib/thin/backends/base.rb:57:in `start'
    thin (1.2.4) lib/thin/server.rb:156:in `start'
    thin (1.2.4) lib/thin/controllers/controller.rb:80:in `start'
    thin (1.2.4) lib/thin/runner.rb:174:in `run_command'
    thin (1.2.4) lib/thin/runner.rb:140:in `run!'
    thin (1.2.4) bin/thin:6:in `<top (required)>'
    /usr/local/bin/thin:19:in `load'
    /usr/local/bin/thin:19:in `<main>'
    
  • Adam S

    Adam S September 23rd, 2009 @ 07:27 AM

    I'm not sure why people are using multibyte characters in most html... shouldn't you be using html entities? [1] Most (all?) html can be rendered using just the standard ASCII character set.

    I don't have any issues with encoding and the latest rails gems.

    Please try checking your app for bad encodings... [2] You may have some invisible encodings in your templates or be using a non-standard version of the quote char...

    [1] http://www.w3schools.com/tags/ref_entities.asp [2] http://github.com/adamsalter/bad_encodings-ruby19/tree

  • James Healy

    James Healy September 23rd, 2009 @ 07:34 AM

    "I'm not sure why people are using multibyte characters in most html... shouldn't you be using html entities?"

    There's nothing saying you should use entities is there (other than for reserved chars like &, etc)?

    Unicode has a hell of a lot more characters than there are HTML entities. As an example, what about asian, indic and arabic scripts?

  • yury

    yury October 31st, 2009 @ 08:40 PM

    +1 to Hector's workaround-erb.diff patch, works for me too.

  • Adam S

    Adam S November 9th, 2009 @ 01:10 AM

    This patch works for me (with Erb templates at least).

    Nathan Weizenbaum has just made a commit to fix this issue in HAML.

    http://github.com/nex3/haml/commit/76bd406875920079bb26445ddeb0d384...

    After thinking about this and spending quite a lot of time trying to track it down, I think the best fix would be for Ruby1.9/Rails to include an encoding converter ASCII-8BIT <=> UTF-8.
    If Rails included this then it would fix all the rails issues anyway.
    Clearly UTF-8 to ASCII-8BIT is a no-op, it's essentially the same as using force_encoding, but ASCII-8BIT to UTF-8 would mean that you could depend on all data to be valid UTF-8. It would really make life so much easier.
    It would also mean that Rails wouldn't have to 'force_encoding' anything. It would use the natural encoding converter for any string, and if people wanted to run in a different encoding they could still specify it on the command line.
    For full support it would actually require ASCII-8BIT <=> 'chosen encoding', but UTF-8 would be a great start.
    I know almost nothing about adding encoding converters to Ruby1.9, but this seems like the most forward compatible change. Data would pass through all levels, Rack, DB, Rails, and be compatible (at least for UTF-8, initially).
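
    As background for this proposal: String#encode has no conversion from ASCII-8BIT to UTF-8 for bytes above 127 and raises instead. A checked conversion along the lines suggested here might look like the sketch below. The helper name binary_to_utf8 is hypothetical, and it assumes the incoming bytes are meant to be UTF-8.

    ```ruby
    # Hypothetical helper: treat binary data as UTF-8, but only if the bytes
    # really form valid UTF-8. Returns a new string; the argument is not modified.
    def binary_to_utf8(data)
      text = data.dup.force_encoding(Encoding::UTF_8)
      raise ArgumentError, "not valid UTF-8: #{data.inspect}" unless text.valid_encoding?
      text
    end

    from_db = "bl\u00e5h".b   # UTF-8 bytes tagged ASCII-8BIT, as a driver might return them
    puts binary_to_utf8(from_db).encoding

    begin
      from_db.encode(Encoding::UTF_8)   # the built-in transcoder rejects high bytes
    rescue Encoding::UndefinedConversionError => e
      puts e.class
    end
    ```

    Unlike a bare force_encoding, this fails early on data that is not valid UTF-8 instead of passing a broken string further down the stack.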

  • hkstar

    hkstar November 18th, 2009 @ 07:03 AM

    • Assigned user changed from “Sven Fuchs” to “Jeremy Kemper”

    Can this be merged into 2.3-stable, please?

    It was freaking 6 months ago.

    Hector's workaround-erb.diff solved the problem and as far as I'm concerned UTF8 is the standard and everyone should use it. Opinionated software, remember?

    @Adam S: "I'm not sure why people are using multibyte characters in most html"

    What on earth are you talking about? Almost every language other than english has multibyte characters and they are, of course, going to be placed in HTML files. Where else would they go? What a ridiculous comment.

  • Jeremy Kemper

    Jeremy Kemper November 18th, 2009 @ 07:47 AM

    • State changed from “new” to “open”
    • Tag changed from 2.3-rc2, patch, ruby1.9, tested to patch, ruby1.9
    • Milestone changed from 2.x to 2.3.6

    The workaround is just as broken as it was six months ago. Please do investigate.

  • Vladimir Penkin

    Vladimir Penkin November 27th, 2009 @ 12:09 PM

    Rails 2.3.5 : Not working,
    Rails 2.3.5 + Hector patch: Working.

  • Jonas Nicklas

    Jonas Nicklas November 27th, 2009 @ 12:49 PM

    So the alternatives are:
    1) Pretty much every real world Rails app anywhere is broken on Ruby 1.9
    2) The patch is applied and we simply assume UTF-8 for templates. Which everyone uses anyway.

    How is that broken? Since no one has provided a better solution over the last six months, shouldn't we just apply this? If someone needs to change the encoding used in templates, they can patch it properly so that the encoding can be chosen.

    As mentioned above, Rails is opinionated software, so why can't we have an opinion on what encoding people should use?

  • Michael Hasenstein

    Michael Hasenstein November 27th, 2009 @ 02:19 PM

    I applied the one-line patch to my freshly installed Rails 2.3.5, but it does not help. Well, it does help with one issue: I no longer get an error when a partial is to be rendered. Instead I now get an error later, where I call a helper function in the view, <%= some_function() %>, which returns some HTML.

    "incompatible character encodings: UTF-8 and ASCII-8BIT" once more.

    Given these issues, how can ANYONE be using ruby 1.9.1 at this point? Or are those who are able to use it using ASCII as default encoding for all files? I (most certainly!) use UTF-8, as it should be in this world. The ASCII-60s and 70s and maybe 80s are long over...

    I'm not (usually) concerned with the inner workings of ruby and rails, just use it (even though I consider myself "hard-core" in other fields I don't want to become an expert with everything). What I find disturbing is that I find no guidelines on Rails and Ruby 1.9.1. I just assumed it should be working by now, since I read a lot of "fixed ruby 1.9 compatibility issues" in Rails and Passenger.

    Does (all of) this discussion mean it isn't so, it's still experimental? I cannot imagine my application is very special.

  • Manfred Stienstra

    Manfred Stienstra November 27th, 2009 @ 02:31 PM

    • Tag cleared.

    Given these issues, how can ANYONE be using ruby 1.9.1 at this point?

    I assume nobody is running applications on 1.9. The encoding changes in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.

  • Mezza

    Mezza November 27th, 2009 @ 05:33 PM

    With regards to the postgres pg gem (not the pure ruby version), I originally encountered issues with encoding with the 0.8.0 version of the gem, but the developers of the gem seem to have applied a patch which works fine in the following branch:

    http://ruby-pg.rubyforge.org/svn/ruby-pg/branches/i17n-19-patches/

    The relevant issue is here:

    http://rubyforge.org/tracker/?func=detail&aid=25931&group_i...

  • Anton Ageev

    Anton Ageev November 27th, 2009 @ 05:37 PM

    I assume nobody is running applications on 1.9. The encoding changes in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.

    I am running rails application on 1.9.1.

    I use two monkey patches: config/initializers/fix_renderable.rb (Hector's patch) and config/initializers/fix_params.rb.

    And I patched postgres gem (http://github.com/antage/postgres) to force UTF-8 encoding for all strings returning from a database.
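
    The contents of fix_params.rb are not shown in this thread. Purely as an illustration of the idea, retagging every string in a params-like structure could be sketched as follows; the helper name force_utf8 and the sample hash are assumptions, not the actual initializer.

    ```ruby
    # Hypothetical helper: walk a params-like structure and retag every string
    # as UTF-8. It relabels bytes only; nothing is transcoded or validated.
    def force_utf8(value)
      case value
      when String then value.dup.force_encoding(Encoding::UTF_8)
      when Hash   then value.each_with_object({}) { |(k, v), out| out[k] = force_utf8(v) }
      when Array  then value.map { |v| force_utf8(v) }
      else value
      end
    end

    # Strings tagged ASCII-8BIT, as described for params[] earlier in this ticket.
    params = { "user" => { "name" => "J\u00e9r\u00f4me".b, "tags" => ["\u00e5".b] } }
    fixed  = force_utf8(params)

    puts fixed["user"]["name"].encoding
    puts fixed["user"]["tags"].first.encoding
    ```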

  • Valentin Nemcev

    Valentin Nemcev December 5th, 2009 @ 02:56 AM

    I'm also trying to run applications on 1.9.1. I'm not very familiar with Rails' internal structure, but I'm using it in a few applications and I want to migrate them to Ruby 1.9 to benefit from its speed and memory efficiency (not to mention the new Ruby features I want to use in future Rails projects).

    But I can't!

    I've tried all the patches and fixes I could find, but they are not working. I'm using Mysql for DB and Haml for templates and I get "incompatible character encodings: UTF-8 and ASCII-8BIT" when I try to render model attribute with Russian letters. Other UTF-8 strings are okay.

    What additional information should I provide to help fixing this issue?

  • trevor

    trevor December 11th, 2009 @ 07:57 PM

    +1 Rails 2.3.5 + Hector patch: Working.

    solved my problem with render partial and μm.

  • Thilo Utke

    Thilo Utke December 17th, 2009 @ 12:16 AM

    +1 Rails 2.3.5 + Hector patch is working for me too

  • James Conroy-Finn

    James Conroy-Finn December 17th, 2009 @ 12:40 PM

    @Jakub Instructions on patching pg to return UTF-8 strings are here: http://gist.github.com/215955 (the diff is http://gist.github.com/215956)

  • Andrew Grim

    Andrew Grim December 21st, 2009 @ 08:15 PM

    Hector's patch works in the case where your default encoding is UTF-8, but doesn't respect the encoding specified by template itself. Using the latest tests in rails I was able to achieve both with this patch. It only affects ERB, but I believe that is where the bug lies anyway. ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.

    This doesn't appear to be an issue with rails 3 as Erubis is used by default, and the bug seems to be ERB specific.

    Attached is a patch for 2-3-stable and also a little script that demonstrates the issue (for fun, change the script's encoding to ASCII-8BIT).
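
    The attached patch is not reproduced here. As a rough sketch of what "respecting the encoding specified by the template" could mean, the snippet below looks for a leading encoding comment and falls back to a default otherwise. The helper name tag_template and the exact comment format it accepts are assumptions for illustration.

    ```ruby
    # Hypothetical helper: choose the encoding tag for a template source.
    # A leading "<%# encoding: NAME %>" comment wins; otherwise the default is used.
    def tag_template(source, default = Encoding.default_external)
      magic_comment = /\A<%#\s*(?:en)?coding:\s*([\w.-]+)\s*%>/
      match = magic_comment.match(source.b)   # match on raw bytes to avoid encoding errors
      encoding = match ? Encoding.find(match[1]) : default
      source.dup.force_encoding(encoding)
    end

    latin1_template = "<%# encoding: iso-8859-1 %>\nHasta ma\xF1ana".b
    plain_template  = "Hello <%= name %>".b

    puts tag_template(latin1_template).encoding
    puts tag_template(plain_template, Encoding::UTF_8).encoding
    ```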

  • Andreas Haller

    Andreas Haller January 21st, 2010 @ 07:04 PM

    ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.

    Is there a bug about this on ruby-lang.org?

    ERB#src seems to behave strangely, but rendering with ERB seems to just work.
    At least on ruby 1.9.2dev (2010-01-22 trunk 26370) [i386-darwin9.8.0]

    # encoding: UTF-8
    require 'erb'
    template = ERB.new("This is IntéraΫiὉnäl Pöחyß")
    puts template.src.encoding       # US-ASCII                     # This is not expected…
    puts template.result             # This is IntéraΫiὉnäl Pöחyß   # … but it just works.
    puts template.result.encoding    # UTF-8                        # This just works, doesn't it?
    
  • Vladimir Penkin

    Vladimir Penkin February 3rd, 2010 @ 07:26 AM

    • Assigned user cleared.

    I'm still having issues with UTF.
    With these patches:
    - mysql.rb
    - fix_renderable.rb
    - fix_params.rb

    Having troubles when trying to POST russian characters to controller.

  • kdgundermann

    kdgundermann February 23rd, 2010 @ 04:39 PM

    • Tag set to encoding, utf8
  • Marcello Barnaba

    Marcello Barnaba March 21st, 2010 @ 12:15 PM

    Hello,

    here is my monkey patch (hack? :-) to fix this issue on current Rails 2.3.5 apps on 1.9.1, that doesn't involve copy-pasting code from ActionView. It is also available as a Gist on GitHub.

    # Rails 2.3.5, Ruby 1.9. ERB returns templates with an ASCII-8BIT encoding, unless they contain
    # an unicode character, and when you render a partial with unicode chars into a layout without,
    # the infamous "incompatible character encodings: ASCII-8BIT and UTF-8" error comes out.
    #
    # This module monkey-patches module_eval into the ActionView::Base::CompiledTemplates module to
    # convert the first argument encoding to UTF-8, if needed.
    #
    # Put it into lib/patches/compiled_templates.rb and require it into the config.after_initialize
    # block of your environment.rb.
    #
    # LH ticket x-reference: https://rails.lighthouseapp.com/projects/8994/tickets/2188
    #
    # - vjt@openssl.it
    #
    module Patches
      module CompiledTemplates
        def self.extended(base)
          base.metaclass.alias_method_chain(:module_eval, :utf8)
        end
     
        def module_eval_with_utf8(*args, &block)
          if args.first.respond_to?(:encoding) && args.first.encoding != Encoding::UTF_8
            args.first.force_encoding(Encoding::UTF_8)
          end
          module_eval_without_utf8(*args, &block)
        end
      end
     
      begin
        RUBY_VERSION.to_f >= 1.9 &&
          ActionView::Base::CompiledTemplates.method(:module_eval_with_utf8)
      rescue NameError
        ActionView::Base::CompiledTemplates.extend Patches::CompiledTemplates
      end
    end
    

    Tested on 1.9.1-p378 and a big Rails app with unicode characters in templates :-).

  • Ivan Ukhov

    Ivan Ukhov April 7th, 2010 @ 12:55 AM

    Here is my solution for HAML (http://gist.github.com/358275):

    require 'kconv' # provides String#toutf8, used below

    module Haml
      class Buffer
        class UTF8String < String
          def << text; super text.toutf8; end
        end
    
        alias original_initialize initialize
    
        def initialize *args
          original_initialize *args
          @buffer = UTF8String.new
        end
      end
    end
    
  • Alberto Fernández Capel

    Alberto Fernández Capel April 9th, 2010 @ 01:39 AM

    There seems to be a problem when returning a UTF-8 string from an erb tag, like this

     <%= "Hasta mañana" %>
    

    I traced the problem to action_view/template_handlers/erb.rb. There, ERB.new always returns an ASCII-8BIT string when asked for its Ruby source. When the view concatenates this string with the UTF-8 string from the tag, you get the infamous error.

    Attached is a patch with a failing test case and a possible solution. Hope it helps!

  • Cezary Baginski

    Cezary Baginski April 25th, 2010 @ 12:15 AM

    Changeset with workaround for ERB in an encoding-friendly way.

    • rebased to 2-3-stable
    • uses encoding comment handling from Andrew Grim's patch
    • introduces new concept for handling encodings from non-rails sources (external_encode!)
    • test cases
    • ActionPack works with -Ku, -Ks, -Ke, -Kn and without -K (as long as #4466 is applied also)

    Comments, feedback, questions are more than welcome ;)

    • for Haml, MySQL, db, rack encoding issues, see other tickets (I might provide a summary soon) - this patch fixes ONLY the ERB handler!

    Thanks to everyone, who helped nail this issue.

  • Jeremy Kemper

    Jeremy Kemper April 25th, 2010 @ 01:29 AM

    • Assigned user set to “Jeremy Kemper”

    Great, working through this. I just backported master changes to 2.3 so I'll rebase it again.

    New development needs to start in master and move back to 2.3, also. Can't have a solution on 2.3 but none on 3.0.

  • Cezary Baginski

    Cezary Baginski April 25th, 2010 @ 03:38 AM

    I'll be glad to switch efforts to 3.0 if this patch turns out to be ok.

  • Cezary Baginski

    Cezary Baginski April 27th, 2010 @ 04:29 PM

    • Tag changed from encoding, utf8 to encoding, patch, utf8

    Patch for Rails 3.0

    activesupport:

    • rewrote the external_encode! function from scratch
    • added more test cases

    actionpack:

    • added test for line numbering (rendering errors)
    • got erb to work with all Ruby options (-Ks, -Ke, -Ku, -Kn, normal, us-ascii)
    • tries to play nice with any encoding settings or templates without requiring magic comments
    • transcode internally to utf-8 because Ruby's concat cannot really transcode and will fail anyway
    • slight cleanup in render encoding test cases

    I'll backport this to 2-3-stable (instead of the previous patch) if everything is ok.

  • Yaroslav Markin

    Yaroslav Markin April 28th, 2010 @ 11:39 PM

    Is there any chance we can have a Rails::Configuration key like

    ...
    config.action_view.encoding = "utf8"
    ...
    

    to skip on defining encoding in each and every template? Would be really handy IMO.

  • Cezary Baginski

    Cezary Baginski April 29th, 2010 @ 09:45 AM

    YES! By all means!

    If you have no magic comments, Ruby's own default encoding, Encoding.default_external, is assumed.

    You can set this inside your application with:

      Encoding.default_external = Encoding::UTF_8 if RUBY_VERSION > "1.9"
    

    If for some reason you want to set it outside the application, you can:

    1. Change it in the command line or shebang line of the server you are running using Ruby's -E or -K option, e.g:
        #!/usr/bin/ruby -Ku
      
    2. Use environment variables (useful if your production has a non-utf8 locale, like LANG="C"):
         LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 start_my_server
      
    3. Set UTF-8 for all Ruby applications, by setting this for your shell:
         export RUBYOPT=-Ku
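
    To check which of these settings actually took effect, it can help to look at how Ruby tags text it reads from disk. This is a minimal sketch, assuming Ruby 1.9 or later and that Encoding.default_internal is left unset; `with_default_external` and `encoding_of_file` are hypothetical helper names:

```ruby
require "tempfile"

# Hypothetical helper: temporarily switches Encoding.default_external and
# restores the previous value afterwards.
def with_default_external(encoding)
  previous = Encoding.default_external
  Encoding.default_external = encoding
  yield
ensure
  Encoding.default_external = previous
end

# Hypothetical helper: reports the encoding Ruby assigns to a file's contents
# when the file is read without an explicit encoding.
def encoding_of_file(path)
  File.read(path).encoding
end

file = Tempfile.new("template")
file.binmode
file.write("bl\xC3\xA5h")   # "blåh" as UTF-8 bytes
file.close

puts with_default_external(Encoding::UTF_8) { encoding_of_file(file.path) }
puts with_default_external(Encoding::ISO_8859_1) { encoding_of_file(file.path) }

file.unlink
```

    The same bytes come back tagged differently depending on the default, so printing Encoding.default_external at boot is a quick way to confirm what the server process really uses.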
      
  • The_Lord

    The_Lord April 29th, 2010 @ 11:46 AM

    You could also wait for Rails 3.0 or start having fun with the current beta. In 3.0.0.beta3 there already is such a line in the application.rb file:

    config.encoding = "utf-8"
    

    Works great :)

  • Cezary Baginski

    Cezary Baginski May 10th, 2010 @ 09:03 PM

    • Assigned user changed from “Jeremy Kemper” to “Cezary Baginski”

    I did my homework on m17n in Ruby and talked a lot with Yehuda. I'll redo the patches from scratch, since the above aren't as they should be.

    I'll probably open a new ticket for Rails 3.0 once I have a proper solution for 2.3, but I will try to keep the patches as identical as possible.

    If I find any other related issues, but not ERB specific, I'll open new tickets for patches.

  • Cezary Baginski

    Cezary Baginski May 12th, 2010 @ 01:26 PM

    • Assigned user changed from “Cezary Baginski” to “Jeremy Kemper”

    New ticket for the Rails 3 version of the patch: #4582

    The following patch is rebased to 2-3-stable and hopefully solves all the Erb encoding issues with Ruby 1.9 and this ticket can be closed.

    Test cases pass with 1.9 using -Ks, -Ku, -Kn, -Ke and 1.8.

    For non Rails 2-3.X Erb specific issues (Haml, DB, Rack, Rails 3), please find existing tickets or create new ones.

    Possible problems with the patch:

    • adds a convenience method to String, which may look like overkill, but it is used twice and may be useful for other templating engines. Rails encoding support is an ongoing issue, so similar functionality will probably be required anyway.

    • line numbering in errors may need better coverage

    • requiring the contents of all templates to already match the internal encoding might be a cleaner solution to this kind of problem, but would be less flexible

    • test cases may need improvements

  • Jonas Nicklas

    Jonas Nicklas May 12th, 2010 @ 02:06 PM

    There is a method called external_encode which is documented as "Encode to internal encoding". That seems pretty strange, but I can't say I understand much of what that method does, so maybe it's correct?

  • Cezary Baginski

    Cezary Baginski May 12th, 2010 @ 03:03 PM

    • Tag changed from encoding, patch, utf8 to encoding, erb, patch, utf8

    Ruby 1.9 has Encoding.default_external and Encoding.default_internal and when the second is nil, the first is used. So encoding to 'internal' is usually the same as encoding to 'external' in most ruby environments.

    A method name to say what it does would be:

    encode-from-given-param-or-default_external-to-default_internal

    It works like Ruby's "encode", except it tries to handle additional cases with binary (ASCII-8BIT) strings, and catching things encode(dst,src) would not.

    Maybe it should be called encode_external, encode_to_internal, etc?

    Any suggestions for better names, implementations, documentation are welcome.
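
    The behaviour described above can be sketched roughly as follows. This is not the external_encode! method from the patch; `to_internal_encoding` is a hypothetical name, and the handling of binary strings is an assumption based only on the description in this comment:

```ruby
# Hypothetical helper, NOT the method from the attached patch.
# Encodes from the given source encoding (or Encoding.default_external)
# to Encoding.default_internal, falling back to default_external when
# default_internal is nil.
def to_internal_encoding(string, source_encoding = nil)
  target = Encoding.default_internal || Encoding.default_external
  source = source_encoding || Encoding.default_external

  copy = string.dup
  # A binary (ASCII-8BIT) string carries no usable encoding, so relabel
  # its bytes with the assumed source encoding before transcoding.
  copy.force_encoding(source) if copy.encoding == Encoding::ASCII_8BIT
  copy.encode(target)
end
```

    For example, calling to_internal_encoding(bytes, Encoding::ISO_8859_1) on binary Latin-1 bytes would return UTF-8 text when the process defaults are UTF-8. A plain encode call on the binary string would typically raise instead, because Ruby has no conversion from ASCII-8BIT for non-ASCII bytes.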

  • Rizwan Reza

    Rizwan Reza May 16th, 2010 @ 02:41 AM

    • Tag changed from encoding, erb, patch, utf8 to bugmash, encoding, erb, patch, utf8
  • Jeremy Kemper

    Jeremy Kemper May 23rd, 2010 @ 05:54 PM

    • Milestone changed from 2.3.6 to 2.3.7
  • Jeremy Kemper

    Jeremy Kemper May 24th, 2010 @ 09:40 AM

    • Milestone changed from 2.3.7 to 2.3.8
  • Jeremy Kemper

    Jeremy Kemper May 25th, 2010 @ 11:45 PM

    • Milestone changed from 2.3.8 to 2.3.9
  • Stefano Diem

    Stefano Diem June 13th, 2010 @ 06:09 AM

    This error also presents itself under rails 3.0.0.beta3 and rails 3.0.0.beta4.
    Since Haml uses ERB::Util to escape strings, overriding html_escape in application.rb worked for me with both ERB and Haml:

    module ERB::Util
      def html_escape(s)
        s = s.to_s.force_encoding("utf-8")
        if s.html_safe?
          s
        else
          s.gsub(/[&"><]/) { |special| HTML_ESCAPE[special] }.html_safe
        end
      end
    end
    

    (I'm not checking whether it is 1.9 or writing test cases, just posting the simplest and most comprehensive way I found to make it work, so maybe it can help someone write a proper patch.)

  • Jeremy Kemper

    Jeremy Kemper August 30th, 2010 @ 02:28 AM

    • Milestone changed from 2.3.9 to 2.3.10
  • Damien MATHIEU

    Damien MATHIEU September 28th, 2010 @ 12:37 PM

    I see this same problem in rails3.
    Stefano's solution solves it. There's one problem with it though. It won't work if the string is frozen.

  • wout

    wout September 29th, 2010 @ 10:33 AM

    I'm having this problem as well with a rails 3 app.

  • wout

    wout September 29th, 2010 @ 12:09 PM

    Stefano's solution worked for me but I had to rework it a little to get it working with frozen strings:

    module ERB::Util
      def html_escape(s)
        frozen = s.frozen?
        
        s = s.dup if frozen
        s = s.to_s.force_encoding("utf-8")
        s = s.gsub(/[&"><]/) { |special| HTML_ESCAPE[special] }.html_safe unless s.html_safe?
        s.freeze if frozen
        
        s
      end
    end
    

    Not the nicest solution but it fixes my production app.
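
    The frozen-string problem comes from force_encoding modifying its receiver in place. This is a minimal sketch that isolates just that step, assuming Ruby 2.0 or later (for String#b in the usage lines); `utf8_copy` is a hypothetical helper name, and it is not a drop-in replacement for html_escape:

```ruby
# Hypothetical helper: returns an unfrozen UTF-8-tagged copy of the value
# and leaves the original object untouched, frozen or not.
def utf8_copy(value)
  value.to_s.dup.force_encoding(Encoding::UTF_8)
end

frozen = "bl\xC3\xA5h".b.freeze   # binary bytes of "blåh", frozen
copy   = utf8_copy(frozen)

puts copy.encoding     # encoding of the relabelled copy
puts frozen.encoding   # the original keeps its binary tag
puts frozen.frozen?    # and stays frozen
```

    Because the copy is never frozen, callers that relied on getting a frozen string back would need to freeze the result themselves, as the workaround above does.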

  • wout

    wout October 1st, 2010 @ 08:05 PM

    The previous workaround worked ok for the views, but it's more of a patch than a solution. The only thing that finally worked for me was installing the ruby-mysql gem:

    gem 'ruby-mysql'

    Finally, make sure you set the encoding of your database to UTF-8. Otherwise you will keep getting errors anyway.

  • Ryan Bigg

    Ryan Bigg October 9th, 2010 @ 10:02 PM

    • Tag cleared.
    • Importance changed from “” to “Low”

    Automatic cleanup of spam.

  • Trung LE

    Trung LE October 31st, 2010 @ 05:58 AM

    Can someone here ban the dodgy chinese spammer 'wangxindan'?

  • Ryan Bigg

    Ryan Bigg November 8th, 2010 @ 01:50 AM

    Automatic cleanup of spam.

  • Ryan Bigg

    Ryan Bigg November 8th, 2010 @ 01:51 AM

    Automatic cleanup of spam.

  • Ryan Bigg

    Ryan Bigg November 8th, 2010 @ 01:53 AM

    Automatic cleanup of spam.

  • joost baaij

    joost baaij February 22nd, 2011 @ 10:45 AM

    FWIW, this gist contains all patches for Rails 2.3 so this problem goes away.
    Note that it doesn't patch the mysql gem since you should use the mysql2 gem on Ruby 1.9 anyway.

    https://gist.github.com/838489

  • Andrew Selder

    Andrew Selder April 27th, 2011 @ 08:12 PM

    I've updated the normalize parameters portion of joost's patch.

    This checks to make sure that the parameter is actually valid in the UTF-8 encoding. If it's not, it tries to interpret the parameter as ISO-8859-1 and then transcode it to UTF-8.

    We're getting lots of query parameters like:
    * URL : http://www.blah.com?&category=concerts&headliner=Charles%20Dub%E9abc

    where there is a 0xE9 byte encoded in the URL. Legal ISO-8859-1, but not UTF-8

    https://gist.github.com/944943
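
    The check-then-fall-back idea described above can be sketched like this. It is not the code from the gist; `normalize_to_utf8` is a hypothetical name, the input is assumed to be the already percent-decoded parameter value, and String#b in the usage lines assumes Ruby 2.0 or later:

```ruby
# Hypothetical helper, not taken from the gist.
# Treats the bytes as UTF-8 when they form valid UTF-8; otherwise assumes
# they are ISO-8859-1 and transcodes them to UTF-8.
def normalize_to_utf8(raw)
  candidate = raw.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# The headliner value from the URL above, after percent-decoding:
# a lone 0xE9 byte is valid ISO-8859-1 but not valid UTF-8.
decoded = "Charles Dub\xE9abc".b
puts normalize_to_utf8(decoded)
```

    Every byte sequence is valid ISO-8859-1, so the fallback never raises, but it will produce the wrong characters if the client actually sent some other legacy encoding.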

Tickets have moved to GitHub

The new ticket tracker is available at https://github.com/rails/rails/issues
