Encoding error in Ruby1.9 for templates
Reported by Jonas Nicklas | March 9th, 2009 @ 11:44 PM | in 2.3.10
In Ruby 1.9 translating Strings which have non-ascii characters in them does not work for me.
If I have keys like this in my translation file:
"sv":
  test1: blah
  test2: blåh
Calling this works fine:
I18n.translate(:test1)
However, calling this raises an exception:
I18n.translate(:test2)
Here's the error:
ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)
This is the same error as in #2038, I did run this against edge Rails and it looked like the patch from #2038 has been applied, so I am assuming this is a different issue.
Comments and changes to this ticket
-
Jonas Nicklas March 18th, 2009 @ 10:19 PM
- Tag changed from 2.3-rc2, i18n to 2.3-rc2, ruby1.9
- Title changed from i18n fails with multibyte Strings in Ruby 1.9 (similar to #2038) to Encoding error in Ruby1.9 for templates
I figured out now that I18n isn't the culprit. I18n.t returns a UTF-8 string, the issue seems to be that templates by default are ASCII-8BIT encoded, and when a UTF-8 string is used they switch over.
<%= "å" %><%= "å".encoding %>
Works, and would return 'åASCII-8BIT'
<%= "å".force_encoding('utf-8') %>
Also works. However:
<%= "å" %><%= "å".force_encoding('utf-8') %>
Fails with the above mentioned error.
I have attached a test case that proves the bug.
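The failure can also be reproduced outside Rails. This is a minimal sketch, assuming a current Ruby (2.0 or later, for String#b); the binary copy stands in for the ASCII-8BIT template buffer, and the helper name concat_outcome is made up for this illustration:

```ruby
# encoding: utf-8

# Returns :ok if the two strings can be concatenated, or the name of the
# exception class Ruby raises when their encodings cannot be mixed.
def concat_outcome(left, right)
  left + right
  :ok
rescue Encoding::CompatibilityError => e
  e.class.name
end

utf8   = "bl\u00E5h"  # UTF-8 text with one non-ASCII character
binary = utf8.b       # same bytes, tagged as ASCII-8BIT

puts concat_outcome(utf8, utf8)      # both UTF-8: concatenation works
puts concat_outcome(binary, binary)  # both binary: concatenation works
puts concat_outcome(binary, utf8)    # mixed, both with high bytes: raises
```

Only the mixed case fails, which matches the template behaviour described above: each tag works alone, the combination does not.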
-
Mauricio Eduardo Szabo March 26th, 2009 @ 01:36 AM
I confirm this error on Rails 2.3.2 and Ruby1.9.
If, for example, I add on one controller: @errors = ["Á", 'Bê']
On any view, a simple: <%= @errors.inspect %>
throws the error incompatible character encodings: ASCII-8BIT and UTF-8
-
Hector E. Gomez Morales March 27th, 2009 @ 06:58 PM
- Tag changed from 2.3-rc2, ruby1.9 to 2.3-rc2, patch, ruby1.9
The problem is the erb code in the Ruby 1.9 distribution. When it compiles the template code it forces an 'ASCII-8BIT' encoding. If the template contains multibyte characters, the compiled code is still returned as an 'ASCII-8BIT' string, and when that string is concatenated with a 'UTF-8' string that also contains multibyte characters, the exception is raised, because strings in these two encodings are only compatible when at least one of them contains nothing but seven-bit characters.
This patch is the result of my research for my proposal for end to end encoding support for Rails. I am working on a patch for erb to resolve this problem. The included patch is a quick hack to force the encoding of the template method code to be utf-8. #1988 is a duplicate of this bug.
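The compatibility rule can be checked directly with Encoding.compatible?, which returns the encoding a concatenation would get, or nil if the strings cannot be combined. A small sketch, assuming a current Ruby; the variable names are only for illustration:

```ruby
# encoding: utf-8

ascii_binary = "abc".b          # seven-bit only, tagged ASCII-8BIT
high_binary  = "bl\u00E5h".b    # contains high bytes, tagged ASCII-8BIT
ascii_utf8   = "abc"            # seven-bit only
high_utf8    = "bl\u00E5h"      # contains a multibyte character

pairs = {
  "7-bit binary  + 7-bit UTF-8"     => [ascii_binary, ascii_utf8],
  "7-bit binary  + multibyte UTF-8" => [ascii_binary, high_utf8],
  "8-bit binary  + 7-bit UTF-8"     => [high_binary, ascii_utf8],
  "8-bit binary  + multibyte UTF-8" => [high_binary, high_utf8]
}

pairs.each do |label, (left, right)|
  # nil means a concatenation of the two would raise.
  puts "#{label}: #{Encoding.compatible?(left, right).inspect}"
end
```

Only the last pair yields nil, which is why templates without non-ASCII characters never triggered the error.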
-
crazy_bug (at terletzki) April 6th, 2009 @ 01:57 PM
Hello! We've got the same problem! Only the error occurs when we fetch data from the database. We're using Mysql and Charset is UTF-8, but the Active Record returns ASCII-8BIT. Is it possible to do similar changes to the activerecord as you did to the actionpack? Seems as we're not the only ones with that problem (http://groups.google.com/group/r...). Can somebody help me with this? Thanks!
-
Hector E. Gomez Morales April 10th, 2009 @ 04:38 PM
Hi, sorry to be so late but I got some solutions to this problem please take a look to #2476
-
Mauricio Eduardo Szabo April 13th, 2009 @ 02:18 PM
Hector, sorry but this is not my problem. My problem is not when I fetch data from a database, it's on template rendering, as I showed in my previous post. The ERB workaround, by the way, worked for me.
(By the way, I use the postgres-pr adapter to fetch data from my database)
-
Portfonica April 20th, 2009 @ 01:07 AM
I'm afraid hector's patch doesn't resolve the problem with another template system such as HAML. :-/ So I think this patch isn't useful.
-
Hector E. Gomez Morales April 21st, 2009 @ 01:14 AM
This patch is concerned with erb as the default templating engine, that I think a lot of people use. If you have a particular haml template that presents the same problem can you provide it so I can dig out the proper fix.
-
qoobaa May 12th, 2009 @ 11:31 PM
- Tag changed from 2.3-rc2, patch, ruby1.9 to 2.3-rc2, patch, ruby1.9, tested
I've created the patch that fixes the problems described by Jonas Nicklas. Now everything in views is encoded using UTF-8. The bad news is that a lot of things are broken now. The described problems with HAML may be caused by Rack params encoding (ASCII-8BIT) and sqlite3-ruby strings encoding (ASCII-8BIT). I've created a ticket in Rack's lighthouse; we also need to fix the sqlite3-ruby gem. Does anybody use the mysql or pg gems? Are they broken also?
-
Manfred Stienstra May 13th, 2009 @ 04:46 PM
Also, the utf_8_encoding patch assumes that everyone will want to use UTF-8 in their templates, this might not be the case.
-
qoobaa May 13th, 2009 @ 05:14 PM
UTF-8 encoding may be changed easily (we can put some variable there), but we have to provide some configuration for that (in environment.rb?). I've fixed the issue in the sqlite3-ruby gem (http://github.com/qoobaa/sqlite3-ruby), however it has no UTF-16 support yet (to be done). I've tried to fix the pg gem, but I need to read the Postgres documentation first to do it (the version in my repository uses UTF-8 as default encoding). I've also created the ticket on Rack's lighthouse.
-
Portfonica May 26th, 2009 @ 12:41 PM
Hector: I don't have a solution, I only have a hack. You can put these lines into your environment.rb
Encoding.default_internal = 'utf-8'
Encoding.default_external = 'utf-8'
Oh, hack != solution :)
-
Anton Ageev May 31st, 2009 @ 12:54 PM
Strings in params[] have ASCII-8BIT encoding too. Is it Rack issue?
-
Anton Ageev May 31st, 2009 @ 01:00 PM
Hector: I don't have a solution, I only have a hack. You can put these lines into your environment.rb
Encoding.default_internal = 'utf-8'
Encoding.default_external = 'utf-8'
This doesn't work for me.
-
qoobaa May 31st, 2009 @ 01:42 PM
The params ASCII-8BIT encoding is a Rack issue: http://rack.lighthouseapp.com/projects/22435/tickets/48-rackutilsun...
Changing the default internal and external encoding also doesn't work in my app.
-
Adam S July 22nd, 2009 @ 04:56 PM
I'm also seeing this issue with Ruby 1.9 and HAML templates.
It's very annoying and confusing... I think the only solution is for Rails to set a default encoding in environment.rb and then do translation from other encodings...
Raising errors for every encoding type is silly.
I basically want my whole app to use utf8, others may want another encoding, then fine just put it in environment.rb.
-
Jérôme August 16th, 2009 @ 12:44 AM
It would be just definitely great if rails could avoid us editing all our files containing unicode characters, all ruby files. I feel like getting a regression with ruby1.9 when I have to add a # encoding: utf-8 header to my hundreds of files...
-
Rocco Di Leo August 26th, 2009 @ 03:44 AM
I also would like to see a environment line where one can set the application wide encoding instead of adding magic comments to all files. Also i did not manage to add magic comments to the .erb files, how would this work?
At least those two possibilities did not work for me:
<%#= encoding: utf-8 %>
<%# encoding: utf-8 %>
-act
-
Jeremy Kemper August 26th, 2009 @ 06:58 AM
You guys aren't saying where you get these Encoding errors. Please include backtraces or, better, failing test cases so we can reproduce.
UTF-8 is already the default external encoding. The magic comments are only if you want to write a template in a different encoding than the default.
-
Rocco Di Leo August 26th, 2009 @ 01:32 PM
Reproduce the problem by using this process
rails utf8errors -d mysql
# add credentials if necessary to config/database.yml
cd utf8errors
rake db:create
script/generate controller utf8errors index
script/generate model user
# add "t.string :name" to the migration file before next step
rake db:migrate
touch app/views/utf8errors/_partial.html.erb
echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
echo "Multibyte String öäü works here" >> app/views/utf8errors/_partial.html.erb
echo "Inserting User with multibyte characters" >> app/views/utf8errors/_partial.html.erb
echo "<% User.create(:name => 'Multibyte Username öäü') %>" >> app/views/utf8errors/_partial.html.erb
echo "Multibyte String from database does NOT work now:" >> app/views/utf8errors/_partial.html.erb
echo "<% User.all.each do |u| %>" >> app/views/utf8errors/_partial.html.erb
echo "<%= u.name %>" >> app/views/utf8errors/_partial.html.erb
echo "<% end %>" >> app/views/utf8errors/_partial.html.erb
echo "<%= render :partial => 'partial' %>" >> app/views/utf8errors/index.html.erb
take care to use ruby 1.9.1 when starting the server
./script/server
=> surf to http://127.0.0.1:3000/utf8errors # should display the error
The error does NOT appear when using the workaround patch by hector which adds the line source.force_encoding('utf-8') if '1.9'.respond_to?(:force_encoding) to the actionpack/lib/action_view/renderable.rb
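What that one-line workaround does can be sketched in plain Ruby: force_encoding only changes the encoding tag of the string, it does not touch the bytes. The snippet assumes a current Ruby, and the variable source is only a stand-in for the compiled template code, not the actual Rails internals:

```ruby
# encoding: utf-8

# Stand-in for compiled template source that came back tagged as binary.
source = "Multibyte String \u00F6\u00E4\u00FC works here".b
value  = "Multibyte Username \u00F6\u00E4\u00FC"   # e.g. a UTF-8 string from the app

bytes_before = source.bytes

# Same guard as in the workaround: only Ruby 1.9+ strings know force_encoding.
source.force_encoding('utf-8') if '1.9'.respond_to?(:force_encoding)

puts source.encoding                 # the tag is now UTF-8
puts source.bytes == bytes_before    # the bytes themselves are unchanged

combined = source + value            # no longer raises
puts combined.valid_encoding?
```

This only helps when the template bytes really are UTF-8, which is the assumption criticised elsewhere in this ticket.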
Backtrace:
4: <% User.create(:name => 'Multibyte Username öäü') %>
5: Multibyte String from database does NOT work now:
6: <% User.all.each do |u| %>
7: <%= u.name %>
8: <% end %>
app/views/utf8errors/_partial.html.erb:7:in `concat'
app/views/utf8errors/_partial.html.erb:7:in `block in _run_erb_app47views47utf8errors47_partial46html46erb_locals_object_partial'
app/views/utf8errors/_partial.html.erb:6:in `each'
app/views/utf8errors/_partial.html.erb:6
app/views/utf8errors/index.html.erb:3
<internal:prelude>:8:in `synchronize'
/usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
/usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
/usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'
Rendered rescues/trace (80.9ms)
Rendered rescues/request_and_response (0.9ms)
Rendering rescues/layout (internal_server_error)
I hope this helps and that the formatting is working...
-act
-
Rocco Di Leo August 26th, 2009 @ 01:35 PM
okay the formatting is kinda broken but i think you get the idea ... in addition, it should fail with Ruby 1.9 (instead of 1.9.1) as well. Also the problem arises with postgresql too (haven't tested sqlite3).
-act
-
Rocco Di Leo August 26th, 2009 @ 02:11 PM
One more note. I just rechecked this process with postgresql 8.4 and in this case the workaround-erb patch by hector is NOT working. Sorry for the confusion before. So summarized for my Setup:
Ruby 1.9.x + Rails 2.3.3 + Mysql 5.1 => not working
Ruby 1.9.x + Rails 2.3.3 + with hector patch + Mysql 5.1 => working
Ruby 1.9.x + Rails 2.3.3 (with and without hector patch) + Postgresql 8.4 => not working
Greets
-act
-
Adam S August 26th, 2009 @ 02:11 PM
Are you sure this isn't the mysql gem?
I used the process here: and I can create multibyte users etc. In fact no real issues with multibyte now...
http://www.taylorluk.com/articles/2009/08/12/ruby-19-and-passenger
Also used this adapter for sqlite3:
-
Rocco Di Leo August 26th, 2009 @ 03:25 PM
thank you, i updated from mysql gem 2.7 to the self-built 2.81 .. with the process above i got the error 'uninitialized constant Encoding::UTF' accessing http://localhost:3000/utf8errors using Rails 2.3.3
when i changed in Rack::utils.rb
RUBY_VERSION >= "1.9" ? result.force_encoding(Encoding::UTF-8) : result
for
RUBY_VERSION >= "1.9" ? result.force_encoding('utf-8') : result
the rendering worked indeed.
Here is the output without alteration for the interested:
[2009-08-26 16:19:01] ERROR NameError: uninitialized constant Encoding::UTF
/usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:105:in `rescue in const_missing'
/usr/local/lib/ruby19/gems/1.9.1/gems/activesupport-2.3.3/lib/active_support/dependencies.rb:94:in `const_missing'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/utils.rb:27:in `unescape'
/usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:36:in `file_exist?'
/usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/static.rb:18:in `call'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:46:in `block in call'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `each'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/urlmap.rb:40:in `call'
/usr/local/lib/ruby19/gems/1.9.1/gems/rails-2.3.3/lib/rails/rack/log_tailer.rb:17:in `call'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/content_length.rb:13:in `call'
/usr/local/lib/ruby19/gems/1.9.1/gems/rack-1.0.0/lib/rack/handler/webrick.rb:46:in `service'
/usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:111:in `service'
/usr/local/lib/ruby19/1.9.1/webrick/httpserver.rb:70:in `run'
/usr/local/lib/ruby19/1.9.1/webrick/server.rb:183:in `block in start_thread'
However, to recap: Does the problem lie in the database connectors? This would mean one must wait for updated versions of the pg and probably mysql gem ...
greets
-act
-
Rocco Di Leo August 26th, 2009 @ 04:02 PM
Okay, i now reinstalled actionpack, rack and the pg-gem. Postgresql works now without any modification. Also the Encoding:: error has disappeared. I don't know why the problem occurred before but for now the problem is solved. When using MySQL, the 2.81 version is needed as discussed. I will recheck on different machines and operating systems later since there must be some bug or unlucky condition somewhere which resulted in the problem before.
-
Manfred Stienstra August 26th, 2009 @ 04:07 PM
Rocco, first off: thanks for all the effort you're putting into this! Do you think you can do all your investigating first and post a short summary with proper formatting afterwards? It's becoming hard to find actual information in your torrent of posts.
-
engineerDave September 23rd, 2009 @ 06:53 AM
I get this error just by having quotes in the text being displayed.
ActionView::TemplateError (incompatible character encodings: UTF-8 and ASCII-8BIT) on line #30
... app/views/blogs/index.html.erb:30:in `concat'
app/views/blogs/index.html.erb:30:in `block in _run_erb_app47views47blogs47index46html46erb'
app/views/blogs/index.html.erb:27:in `each'
app/views/blogs/index.html.erb:27
app/controllers/blogs_controller.rb:33:in `index'
<internal:prelude>:8:in `synchronize'
thin (1.2.4) lib/thin/connection.rb:76:in `block in pre_process'
thin (1.2.4) lib/thin/connection.rb:74:in `catch'
thin (1.2.4) lib/thin/connection.rb:74:in `pre_process'
thin (1.2.4) lib/thin/connection.rb:57:in `process'
thin (1.2.4) lib/thin/connection.rb:42:in `receive_data'
eventmachine (0.12.8) lib/eventmachine.rb:242:in `run_machine'
eventmachine (0.12.8) lib/eventmachine.rb:242:in `run'
thin (1.2.4) lib/thin/backends/base.rb:57:in `start'
thin (1.2.4) lib/thin/server.rb:156:in `start'
thin (1.2.4) lib/thin/controllers/controller.rb:80:in `start'
thin (1.2.4) lib/thin/runner.rb:174:in `run_command'
thin (1.2.4) lib/thin/runner.rb:140:in `run!'
thin (1.2.4) bin/thin:6:in `<top (required)>'
/usr/local/bin/thin:19:in `load'
/usr/local/bin/thin:19:in `<main>'
-
Adam S September 23rd, 2009 @ 07:27 AM
I'm not sure why people are using multibyte characters in most html... shouldn't you be using html entities? [1] Most (all?) html can be rendered using just the standard ASCII character set.
I don't have any issues with encoding and the latest rails gems.
Please try checking your app for bad encodings... [2] You may have some invisible encodings in your templates or be using a non-standard version of the quote char...
[1] http://www.w3schools.com/tags/ref_entities.asp [2] http://github.com/adamsalter/bad_encodings-ruby19/tree
-
James Healy September 23rd, 2009 @ 07:34 AM
"I'm not sure why people are using multibyte characters in most html... shouldn't you be using html entities?"
There's nothing saying you should use entities is there (other than for reserved chars like &, etc)?
Unicode has a hell of a lot more characters than there are HTML entities. As an example, what about asian, indic and arabic scripts?
-
Adam S November 9th, 2009 @ 01:10 AM
This patch works for me (with Erb templates at least).
Nathan Weizenbaum has just made a commit to fix this issue in HAML.
http://github.com/nex3/haml/commit/76bd406875920079bb26445ddeb0d384...
After thinking about this and spending quite a lot of time trying to track it down I think the best fix would be for Ruby1.9/Rails to include an encoding converter ASCII-8BIT <=> UTF-8.
If Rails included this then it would fix all the rails issues anyway.
Clearly UTF-8 to ASCII-8BIT is a no-op, it's essentially the same as using force_encoding, but ASCII-8BIT to UTF-8 would mean that you could depend on all data to be valid UTF-8. It would really make life so much easier.
It would also mean that Rails didn't have to 'force_encoding' anything. It would use the natural encoding converter for any string and if people wanted to run in a different encoding they could still specify it on the command-line.
For full support it would actually require ASCII-8BIT <=> 'chosen encoding', but UTF-8 would be a great start.
I know almost nothing about adding encoding converters to Ruby1.9, but this seems like the most forward compatible change. Data would pass through all levels, Rack, DB, Rails, and be compatible (at least for UTF-8, initially).
-
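For context, this is roughly what the missing converter looks like from Ruby code. A sketch assuming a current Ruby: String#encode has no mapping for high bytes in an ASCII-8BIT string, so transcoding raises, while force_encoding just relabels the bytes.

```ruby
# encoding: utf-8

binary = "bl\u00E5h".b   # UTF-8 bytes, but tagged ASCII-8BIT

# Transcoding asks Ruby to convert each byte; bytes above 0x7F have no
# defined meaning in ASCII-8BIT, so the conversion fails.
transcode_result = begin
  binary.encode(Encoding::UTF_8)
rescue Encoding::UndefinedConversionError => e
  e.class.name
end
puts transcode_result

# Relabelling keeps the bytes and only changes the tag.
relabelled = binary.dup.force_encoding(Encoding::UTF_8)
puts relabelled.valid_encoding?
```

The proposal above amounts to making the first call behave like the second for data that is already valid UTF-8.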
hkstar November 18th, 2009 @ 07:03 AM
- Assigned user changed from Sven Fuchs to Jeremy Kemper
Can this be merged into 2.3-stable, please?
It was freaking 6 months ago.
Hector's workaround-erb.diff solved the problem and as far as I'm concerned UTF8 is the standard and everyone should use it. Opinionated software, remember?
@Adam S: "I'm not sure why people are using multibyte characters in most html"
What on earth are you talking about? Almost every language other than english has multibyte characters and they are, of course, going to be placed in HTML files. Where else would they go? What a ridiculous comment.
-
Jeremy Kemper November 18th, 2009 @ 07:47 AM
- State changed from new to open
- Tag changed from 2.3-rc2, patch, ruby1.9, tested to patch, ruby1.9
- Milestone changed from 2.x to 2.3.6
The workaround is just as broken as it was six months ago. Please do investigate.
-
Deleted User November 27th, 2009 @ 12:09 PM
Rails 2.3.5 : Not working,
Rails 2.3.5 + Hector patch: Working.
-
Jonas Nicklas November 27th, 2009 @ 12:49 PM
So the alternatives are:
1) Pretty much every real world Rails app anywhere is broken on Ruby 1.9
2) The patch is applied and we simply assume UTF-8 for templates. Which everyone uses anyway.
How is that broken? Since no one has provided a better solution over the last six months, shouldn't we just apply this, and if someone needs to change the encoding used in templates, then they can patch it properly so that they can choose the encoding.
As mentioned above, Rails is opinionated software, why can't we have an opinion on what encoding people should use?
-
Michael Hasenstein November 27th, 2009 @ 02:19 PM
I applied the one-line patch to my just installed Rails 2.3.5 - but it does not help. Well, it does help with one issue: I no longer get an error when a partial is to be rendered. Instead I now get an error later, where I call a helper function in the view, <%= some_function() %>, which returns some HTML.
"incompatible character encodings: UTF-8 and ASCII-8BIT" once more.
Given these issues, how can ANYONE be using ruby 1.9.1 at this point? Or are those who are able to use it using ASCII as default encoding for all files? I (most certainly!) use UTF-8, as it should be in this world. The ASCII-60s and 70s and maybe 80s are long over...
I'm not (usually) concerned with the inner workings of ruby and rails, just use it (even though I consider myself "hard-core" in other fields I don't want to become an expert with everything). What I find disturbing is that I find no guidelines on Rails and Ruby 1.9.1. I just assumed it should be working by now, since I read a lot of "fixed ruby 1.9 compatibility issues" in Rails and Passenger.
Does (all of) this discussion mean it isn't so, it's still experimental? I cannot imagine my application is very special.
-
Manfred Stienstra November 27th, 2009 @ 02:31 PM
- Tag cleared.
Given these issues, how can ANYONE be using ruby 1.9.1 at this point?
I assume nobody is running applications on 1.9. The encoding changes in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.
-
Mezza November 27th, 2009 @ 05:33 PM
With regards to the postgres pg gem (not the pure ruby version), I originally encountered issues with encoding with the 0.8.0 version of the gem, but the developers of the gem seem to have applied a patch which works fine in the following branch:
http://ruby-pg.rubyforge.org/svn/ruby-pg/branches/i17n-19-patches/
The relevant issue is here:
http://rubyforge.org/tracker/?func=detail&aid=25931&group_i...
-
Anton Ageev November 27th, 2009 @ 05:37 PM
I assume nobody is running applications on 1.9. The encoding changes in Ruby are pretty big and it will take a lot of work to resolve all the encoding issues in all libraries and Rails.
I am running rails application on 1.9.1.
I use two monkey patches:
config/initializers/fix_renderable.rb (Hector's patch) and config/initializers/fix_params.rb.
And I patched postgres gem (http://github.com/antage/postgres) to force UTF-8 encoding for all strings returning from a database.
-
Valentin Nemcev December 5th, 2009 @ 02:56 AM
I'm also trying to run applications on 1.9.1. I'm not very familiar with rails internal structure, but I'm using it in few applications and i want to migrate them to ruby 1.9 to benefit from speed and memory efficiency (not talking about new Ruby features I want to use in future Rails projects).
But I can't!
I've tried all the patches and fixes I could find, but they are not working. I'm using Mysql for DB and Haml for templates and I get "incompatible character encodings: UTF-8 and ASCII-8BIT" when I try to render model attribute with Russian letters. Other UTF-8 strings are okay.
What additional information should I provide to help fixing this issue?
-
trevor December 11th, 2009 @ 07:57 PM
+1 Rails 2.3.5 + Hector patch: Working.
solved my problem with render partial and μm.
-
James Conroy-Finn December 17th, 2009 @ 12:40 PM
@Jakub Instructions on patching pg to return UTF-8 strings are here: http://gist.github.com/215955 (the diff is http://gist.github.com/215956)
-
Andrew Grim December 21st, 2009 @ 08:15 PM
Hector's patch works in the case where your default encoding is UTF-8, but doesn't respect the encoding specified by template itself. Using the latest tests in rails I was able to achieve both with this patch. It only affects ERB, but I believe that is where the bug lies anyway. ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.
This doesn't appear to be an issue with rails 3 as Erubis is used by default, and the bug seems to be ERB specific.
Attached is a patch for 2-3-stable and also a little script that demonstrates the issue (for fun, change the script's encoding to ASCII-8BIT).
-
Andreas Haller January 21st, 2010 @ 07:04 PM
ERB#src will always return strings encoded as either ASCII or ASCII-8BIT, regardless of both your default encoding and the encoding specified by the ERB string.
Is there a bug about this on ruby-lang.org?
Erb#src seems to behave strange, but rendering with Erb seems to just work.
At least on ruby 1.9.2dev (2010-01-22 trunk 26370) [i386-darwin9.8.0]
# encoding: UTF-8
require 'erb'
template = ERB.new("This is IntéraΫiὉnäl Pöחyß")
puts template.src.encoding # US-ASCII
# This is not expected…
puts template.result # This is IntéraΫiὉnäl Pöחyß
# … but it just works.
puts template.result.encoding # UTF-8
# This just works, doesn't it?
-
Deleted User February 3rd, 2010 @ 07:26 AM
- Assigned user cleared.
I'm still having issues with UTF.
With these patches:
- mysql.rb
- fix_renderable.rb
- fix_params.rb
I'm having trouble when trying to POST Russian characters to a controller.
-
kdgundermann February 23rd, 2010 @ 04:39 PM
- Tag set to encoding, utf8
-
Marcello Barnaba March 21st, 2010 @ 12:15 PM
Hello,
here is my monkey patch (hack? :-) to fix this issue on current Rails 2.3.5 apps on 1.9.1, that doesn't involve copy-pasting code from ActionView. It is also available as a Gist on GitHub.
# Rails 2.3.5, Ruby 1.9. ERB returns templates with an ASCII-8BIT encoding, unless they contain
# an unicode character, and when you render a partial with unicode chars into a layout without,
# the infamous "incompatible character encodings: ASCII-8BIT and UTF-8" error comes out.
#
# This module monkey-patches module_eval into the ActionView::Base::CompiledTemplates module to
# convert the first argument encoding to UTF-8, if needed.
#
# Put it into lib/patches/compiled_templates.rb and require it into the config.after_initialize
# block of your environment.rb.
#
# LH ticket x-reference: https://rails.lighthouseapp.com/projects/8994/tickets/2188
#
# - vjt@openssl.it
#
module Patches
  module CompiledTemplates
    def self.extended(base)
      base.metaclass.alias_method_chain(:module_eval, :utf8)
    end

    def module_eval_with_utf8(*args, &block)
      if args.first.respond_to?(:encoding) && args.first.encoding != Encoding::UTF_8
        args.first.force_encoding(Encoding::UTF_8)
      end

      module_eval_without_utf8(*args, &block)
    end
  end

  begin
    RUBY_VERSION.to_f >= 1.9 &&
      ActionView::Base::CompiledTemplates.method(:module_eval_with_utf8)
  rescue NameError
    ActionView::Base::CompiledTemplates.extend Patches::CompiledTemplates
  end
end
Tested on 1.9.1-p378 and a big Rails app with unicode characters in templates :-).
-
Ivan Ukhov April 7th, 2010 @ 12:55 AM
Here is my solution for HAML (http://gist.github.com/358275):
module Haml
  class Buffer
    class UTF8String < String
      def << text; super text.toutf8; end
    end

    alias original_initialize initialize

    def initialize *args
      original_initialize *args
      @buffer = UTF8String.new
    end
  end
end
-
Alberto Fernández Capel April 9th, 2010 @ 01:39 AM
There seems to be a problem when returning an UTF8 string from an erb tag, like this
<%= "Hasta mañana" %>
I traced the problem to action_view/template_handlers/erb.rb. There, ERB.new always returns an ASCII-8BIT string when asked for its Ruby source. When the view concatenates this string with the UTF-8 string from the tag, you get the infamous error.
Attached is a patch with a failing test case and a possible solution. Hope it helps!
-
Cezary Baginski April 25th, 2010 @ 12:15 AM
Changeset with workaround for ERB in an encoding-friendly way.
- rebased to 2-3-stable
- uses encoding comment handling from Andrew Grim's patch
- introduces new concept for handling encodings from non-rails sources (external_encode!)
- test cases
- ActionPack works with -Ku, -Ks, -Ke, -Kn and without -K (as long as #4466 is applied also)
Comments, feedback, questions are more than welcome ;)
- for Haml, MySQL, db, rack encoding issues, see other tickets (I might provide a summary soon) - this patch fixes ONLY the ERB handler!
Thanks to everyone, who helped nail this issue.
-
Jeremy Kemper April 25th, 2010 @ 01:29 AM
- Assigned user set to Jeremy Kemper
Great, working through this. I just backported master changes to 2.3 so I'll rebase it again.
New development needs to start in master and move back to 2.3, also. Can't have a solution on 2.3 but none on 3.0.
-
Cezary Baginski April 25th, 2010 @ 03:38 AM
I'll be glad to switch efforts to 3.0 if this patch turns out to be ok.
-
Cezary Baginski April 27th, 2010 @ 04:29 PM
- Tag changed from encoding, utf8 to encoding, patch, utf8
Patch for Rails 3.0
activesupport:
- rewrote the external_encode! function from scratch
- added more test cases
actionpack:
- added test for line numbering (rendering errors)
- got erb to work with all Ruby options (-Ks, -Ke, -Ku, -Kn, normal, us-ascii)
- tries to play nice with any encoding settings or templates without requiring magic comments
- transcode internally to utf-8 because Ruby's concat cannot really transcode and will fail anyway
- slight cleanup in render encoding test cases
I'll backport this to 2-3-stable (instead of the previous patch) if everything is ok.
-
Yaroslav Markin April 28th, 2010 @ 11:39 PM
Is there any chance we can have a Rails::Configuration key like
... config.action_view.encoding = "utf8" ...
to skip on defining encoding in each and every template? Would be really handy IMO.
-
Cezary Baginski April 29th, 2010 @ 09:45 AM
YES! By all means!
If you have no magic comments, Ruby's own default encoding, Encoding.default_external, is assumed.
You can set this inside your application with:
Encoding.default_external = Encoding::UTF_8 if RUBY_VERSION > "1.9"
If for some reason you want to set it outside the application, you can:
- Change it in the command line or shebang line of the server you
are running using Ruby's -E or -K option, e.g:
#!/usr/bin/ruby -Ku
- Use environment variables (useful if your production has a
non-utf8 locale, like LANG="C"):
LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 start_my_server
- Set UTF-8 for all Ruby applications, by setting this for your
shell:
export RUBYOPT=-Ku
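The effect of the in-application setting can be seen on anything read from disk. A minimal sketch, assuming a current Ruby (Tempfile.create with a block needs 2.1 or later); the helper name read_back is made up for the example:

```ruby
# encoding: utf-8
require "tempfile"

# Assumption: the application wants UTF-8 for everything it reads.
Encoding.default_external = Encoding::UTF_8

# Writes raw bytes to a temporary file and reads them back in text mode,
# so the returned string is tagged with Encoding.default_external.
def read_back(bytes)
  Tempfile.create("encoding-demo") do |file|
    file.binmode
    file.write(bytes)
    file.flush
    File.read(file.path)
  end
end

text = read_back("bl\u00E5h".b)
puts text.encoding          # follows Encoding.default_external
puts text.valid_encoding?
```

Template files are read the same way, which is why the default external encoding decides how a template without a magic comment is interpreted.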
-
The_Lord April 29th, 2010 @ 11:46 AM
You could also wait for Rails 3.0 or start having fun with the current beta. In 3.0.0.beta3 there already is such a line in the application.rb file:
config.encoding = "utf-8"
Works great :)
-
Cezary Baginski May 10th, 2010 @ 09:03 PM
- Assigned user changed from Jeremy Kemper to Cezary Baginski
I did my homework on m17n in Ruby and talked a lot with Yehuda. I'll redo the patches from scratch, since the above aren't as they should be.
I'll probably open a new ticket for Rails 3.0 once I have a proper solution for 2.3, but I will try to keep the patches as identical as possible.
If I find any other related issues, but not ERB specific, I'll open new tickets for patches.
-
Cezary Baginski May 12th, 2010 @ 01:26 PM
- Assigned user changed from Cezary Baginski to Jeremy Kemper
New ticket for the Rails 3 version of the patch: #4582
The following patch is rebased to 2-3-stable and hopefully solves all the Erb encoding issues with Ruby 1.9 and this ticket can be closed.
Test cases pass with 1.9 using -Ks, -Ku, -Kn, -Ke and 1.8.
For non Rails 2-3.X Erb specific issues (Haml, DB, Rack, Rails 3), please find existing tickets or create new ones.
Possible problems with the patch:
-
adds a convenience method to String which may look like overkill, but it is used twice and may be useful for other templating engines. Rails encoding support is an ongoing issue anyway and similar functionality will probably be required anyway.
-
line numbering in errors may need better coverage
-
requiring the contents of all templates to already match the internal encoding might be a cleaner solution to this kind of problem, but would be less flexible
-
test cases may need improvements
-
Jonas Nicklas May 12th, 2010 @ 02:06 PM
There is a method called external_encode which is documented as "Encode to internal encoding". That seems pretty strange, but I can't say I understand much of what that method does, so maybe it's correct?
-
Cezary Baginski May 12th, 2010 @ 03:03 PM
- Tag changed from encoding, patch, utf8 to encoding, erb, patch, utf8
Ruby 1.9 has Encoding.default_external and Encoding.default_internal and when the second is nil, the first is used. So encoding to 'internal' is usually the same as encoding to 'external' in most ruby environments.
A method name to say what it does would be:
encode-from-given-param-or-default_external-to-default_internal
It works like Ruby's "encode", except it tries to handle additional cases with binary (ASCII-8BIT) strings, and catching things encode(dst,src) would not.
Maybe it should be called encode_external, encode_to_internal, etc?
Any suggestions for better names, implementations, documentation are welcome.
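To make the naming discussion concrete, here is a rough sketch of the behaviour described above. It is not the code from the patch: the name encode_to_internal and its keyword arguments are hypothetical, and it assumes a current Ruby.

```ruby
# encoding: utf-8

# Hypothetical helper, not the method from the patch. Converts +str+ to the
# target encoding; binary (ASCII-8BIT) input is first relabelled as +source+,
# because String#encode cannot transcode high bytes from ASCII-8BIT.
def encode_to_internal(str,
                       source: Encoding.default_external,
                       target: Encoding.default_internal || Encoding.default_external)
  copy = str.dup
  copy.force_encoding(source) if copy.encoding == Encoding::ASCII_8BIT
  copy.encode(target)
end

# Latin-1 bytes that arrived untagged, e.g. from a driver or socket.
latin1_bytes = "bl\xE5h".b
converted = encode_to_internal(latin1_bytes,
                               source: Encoding::ISO_8859_1,
                               target: Encoding::UTF_8)
puts converted.encoding
puts converted == "bl\u00E5h"
```

With default_internal left at nil, the default target falls back to default_external, which is the "internal is usually the same as external" case mentioned above.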
-
Rizwan Reza May 16th, 2010 @ 02:41 AM
- Tag changed from encoding, erb, patch, utf8 to bugmash, encoding, erb, patch, utf8
-
Stefano Diem June 13th, 2010 @ 06:09 AM
This error also presents itself under rails 3.0.0.beta3 and rails 3.0.0.beta4.
Since Haml uses ERB::Util to escape strings, overriding html_escape in application.rb worked for me with both ERB and Haml:
module ERB::Util
  def html_escape(s)
    s = s.to_s.force_encoding("utf-8")
    if s.html_safe?
      s
    else
      s.gsub(/[&"><]/) { |special| HTML_ESCAPE[special] }.html_safe
    end
  end
end
(I am not checking for Ruby 1.9 or adding test cases here; this is just the simplest, most comprehensive way I found to make it work, in case it helps someone write a proper patch.)
-
Damien MATHIEU September 28th, 2010 @ 12:37 PM
I see this same problem in rails3.
Stefano's solution solves it. There's one problem with it though: it won't work if the string is frozen.
-
wout September 29th, 2010 @ 12:09 PM
Stefano's solution worked for me but I had to rework it a little to get it working with frozen strings:
module ERB::Util
  def html_escape(s)
    frozen = s.frozen?
    s = s.dup if frozen
    s = s.to_s.force_encoding("utf-8")
    s = s.gsub(/[&"><]/) { |special| HTML_ESCAPE[special] }.html_safe unless s.html_safe?
    s.freeze if frozen
    s
  end
end
Not the nicest solution but it fixes my production app.
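For anyone wondering why the frozen case matters: force_encoding changes the string in place, so it raises on a frozen string. A small standalone sketch of the problem and the dup workaround follows; the helper name utf8_copy is made up for this example and is not part of Rails or of the workaround above.

```ruby
# Hypothetical helper: returns a UTF-8 tagged string without mutating its argument.
def utf8_copy(value)
  str = value.to_s
  # force_encoding modifies the receiver, so work on a copy when it is frozen.
  str = str.dup if str.frozen?
  str.force_encoding(Encoding::UTF_8)
end

# Bytes of "blåh" tagged as binary and then frozen.
frozen = "bl\xC3\xA5h".dup.force_encoding(Encoding::ASCII_8BIT).freeze

begin
  frozen.force_encoding(Encoding::UTF_8)
rescue RuntimeError => e
  # Newer Rubies raise FrozenError, which is a subclass of RuntimeError.
  puts "in-place relabel failed: #{e.class}"
end

copy = utf8_copy(frozen)
puts copy.encoding
puts frozen.encoding
```

The original frozen string keeps its ASCII-8BIT tag; only the copy is relabelled.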
-
wout October 1st, 2010 @ 08:05 PM
The previous workaround worked ok for the views, but it's more of a patch than a solution. The only thing that finally worked for me was installing the ruby-mysql gem:
gem 'ruby-mysql'
Finally, make sure you set the encoding of your database to UTF-8. Otherwise you will keep getting errors anyway.
-
Ryan Bigg October 9th, 2010 @ 10:02 PM
- Tag cleared.
- Importance changed from to Low
Automatic cleanup of spam.
-
joost baaij February 22nd, 2011 @ 10:45 AM
FWIW, this gist contains all patches for Rails 2.3 so this problem goes away.
Note that it doesn't patch the mysql gem since you should use the mysql2 gem on Ruby 1.9 anyway.
-
Andrew Selder April 27th, 2011 @ 08:12 PM
I've updated the normalize parameters portion of joost's patch.
This checks to make sure that the parameter is actually valid in the UTF-8 encoding. If it's not, it tries to interpret the parameter as ISO-8859-1 and then transcode it to UTF-8.
We're getting lots of query parameters like:
* URL : http://www.blah.com?&category=concerts&headliner=Charles%20Dub%E9abc
where there is a 0xE9 byte encoded in the URL. Legal ISO-8859-1, but not UTF-8.
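The check described above can be sketched roughly as follows. This is only an illustration of the "valid UTF-8, else treat as ISO-8859-1" idea, not the code from the updated patch; the method name normalize_param is an assumption.

```ruby
# Hypothetical helper; the name and structure are assumptions for illustration.
def normalize_param(value)
  # First try to read the raw bytes as UTF-8.
  utf8 = value.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Otherwise assume ISO-8859-1, where every byte is a valid character,
  # and transcode those characters to UTF-8.
  value.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "Dub\xE9" is what %E9 unescapes to: legal ISO-8859-1, invalid UTF-8.
raw = "Charles Dub\xE9".dup.force_encoding(Encoding::ASCII_8BIT)
fixed = normalize_param(raw)
puts fixed.encoding
puts fixed.valid_encoding?
```

A parameter that is already valid UTF-8 passes through with its bytes unchanged; only the encoding tag is set.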
Tickets have moved to Github
The new ticket tracker is available at https://github.com/rails/rails/issues
Referenced by
- 3878 8bit characters in templates Any chance this is a duplicate of #2188?
- 4582 Multiple encoding support for Erubis This just #2188 for Rails 3.0
- 4582 Multiple encoding support for Erubis Basically a port of #2188 to Rails 3, but I want to make ...
- 4807 ERROR Encoding::UndefinedConversionError: "\xC3" from ASCII-8BIT to UTF-8 Maybe linked with #3947 and #2188 ?
- 3947 incompatible character encodings: UTF-8 and ASCII-8BIT for Rails 3 This might be a duplicate of #2188, but I didn't have any...
- 2476 ASCII-8BIT encoding of query results in rails 2.3.2 and ruby 1.9.1 From #2188:
- 2476 ASCII-8BIT encoding of query results in rails 2.3.2 and ruby 1.9.1 This has been reported in #2188 and in the rails talk gro...
- 2476 ASCII-8BIT encoding of query results in rails 2.3.2 and ruby 1.9.1 Again like in #2188 rails is not the culprit here, the on...
- 3878 8bit characters in templates Marking this as a duplicate of #2188. Please take it up t...