Ruby 1.9 and ActiveSupport
Reported by Dimitrij Denissenko | May 9th, 2009 @ 08:45 PM | in 2.x
Hi!
I came across the following error in ActiveSupport (in combination with Ruby 1.9)
@@@
$ irb1.9 -r activesupport
irb(main):001:0> "\xAA".blank?
ArgumentError: invalid byte sequence in UTF-8
    from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `=~'
    from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `!~'
    from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `blank?'
    from (irb):1
    from /usr/local/bin/irb1.9:12:in `<main>'
@@@
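The failure can also be reproduced without ActiveSupport. The following is a minimal sketch, assuming the check at blank.rb:50 is a `!~ /\S/` regex test, as the `=~` and `!~` frames in the backtrace suggest. `regex_blank?` is a hypothetical stand-in, not the library's own method.

```ruby
# Hypothetical stand-in for the regex-based check (assumption: blank.rb:50
# performs a `!~ /\S/` test, as the backtrace frames suggest).
def regex_blank?(str)
  str !~ /\S/
end

# Runs the check and reports either its result or the error it raised.
def blank_check_outcome(str)
  regex_blank?(str)
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

# "\xAA" is a lone continuation byte, which is not valid UTF-8 on its own.
broken = "\xAA".dup.force_encoding(Encoding::UTF_8)

puts "valid_encoding? => #{broken.valid_encoding?}"
puts "regex check     => #{blank_check_outcome(broken).inspect}"
puts "plain string    => #{blank_check_outcome('  ').inspect}"
```

The point of the sketch is that the string object itself is perfectly usable; it is the regex match that refuses it.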
Some Background: in my case, my application fails in ActionController::Response:
@@@
module ActionController # :nodoc:
  class Response < Rack::Response
    def etag=(etag)
      if etag.blank?
        headers.delete('ETag')
      else
        headers['ETag'] = %("#{Digest::MD5.hexdigest(ActiveSupport::Cache.expand_cache_key(etag))}")
      end
    end
  end
end
@@@
The generated etag contains "\xAA" and fails on etag.blank?
Any clue how to deal with that? Any idea for a workaround?
Comments and changes to this ticket
-
Scott July 14th, 2009 @ 03:03 AM
Hi Dimitrij,
This issue just bit me on Ruby 1.9.1 as well. For me, it occurred on an action involving a CSV data export.
For the time being, I just wrapped the method body in a begin/rescue block and called headers.delete('ETag') if the test failed. As the ETag is not critical to the functionality of the application, it seemed a decent (if hasty) response.
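A minimal sketch of that workaround, for illustration only. `SketchResponse` and `regex_blank?` are hypothetical stand-ins for ActionController::Response and the regex-based blank? check, and the call to ActiveSupport::Cache.expand_cache_key is left out so the example runs on its own.

```ruby
require 'digest/md5'

# Hypothetical stand-in for the regex-based blank? check (assumption).
def regex_blank?(value)
  value.nil? || value.to_s !~ /\S/
end

# Simplified stand-in for the response object; not the real Rails class.
class SketchResponse
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def etag=(etag)
    if regex_blank?(etag)
      headers.delete('ETag')
    else
      headers['ETag'] = %("#{Digest::MD5.hexdigest(etag.to_s)}")
    end
  rescue ArgumentError
    # The blank test raised on a string with an invalid encoding:
    # give up on the ETag rather than fail the whole request.
    headers.delete('ETag')
  end
end

response = SketchResponse.new
response.etag = "report-v1"
puts response.headers.inspect

# Assigning a broken string no longer propagates the error.
response.etag = "\xAA".dup.force_encoding(Encoding::UTF_8)
puts response.headers.inspect
```

The trade-off is the one described above: the response goes out without an ETag, so conditional GETs for that action stop working, but the request itself succeeds.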
Please keep me posted if you've found a better workaround, or if this issue has been resolved.
Thanks,
C. Scott Andreas
Developer, Sunago.org -
Scott July 14th, 2009 @ 08:01 AM
Hello,
I've taken some time to better understand this behavior, patch the underlying issue, and prepare a test demonstrating the bug and verifying the resolution.
In my case, the application was also failing in ActionController::Response. The data rendered in my action contained a non-UTF8 character that, in the rendering process, is coerced to UTF-8 (likely from ASCII_8BIT) - this behavior is tested in part in actionpack/test/new_base/render_test.rb:
def test_render_utf8_template_with_magic_comment
  with_external_encoding Encoding::ASCII_8BIT do
    result = @view.render(:file => "test/utf8_magic.html.erb", :layouts => "layouts/yield")
    assert_equal "Русский текст\nUTF-8\nUTF-8\nUTF-8\n", result
    assert_equal Encoding::UTF_8, result.encoding
  end
end
However, if non-UTF-8 data is coerced into UTF-8 and String#blank? is called on it, an ArgumentError is raised, because the current implementation (a regular expression match) raises when it receives a string with an invalid encoding. So when the rendered response is evaluated to prepare ETags in ActionController::Response, the application breaks while testing the document body for blankness.
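How such a string can come about can be sketched as follows. This assumes the coercion is a relabeling of bytes via String#force_encoding, which the thread does not confirm. The helper names are hypothetical and the sample byte is the one from the original report.

```ruby
# Relabel: keep the bytes and change only the encoding tag. Nothing is validated.
def relabel_as_utf8(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8)
end

# Transcode: convert the content, raising on anything that cannot be mapped.
def transcode_to_utf8(bytes)
  bytes.encode(Encoding::UTF_8)
end

# Binary (ASCII-8BIT) input; \xAA is the byte from the original report.
raw = "caf\xAA".b

relabeled = relabel_as_utf8(raw)
puts "encoding:        #{relabeled.encoding}"
puts "valid_encoding?: #{relabeled.valid_encoding?}"

begin
  transcode_to_utf8(raw)
rescue EncodingError => e
  puts "transcoding refused the input: #{e.class}"
end
```

Relabeling defers the problem to the first operation that cares about validity, which in this ticket happens to be the regex inside blank?.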
Here are steps to reproduce the bug, both in a Rails 2.3.2 app on Ruby 1.9.1 in a console, and in IRB by requiring ActiveSupport: http://gist.github.com/146722
See the backtrace here: http://u.phoreo.com/lu.html
While it is not surprising that an error could result at some level when working with data of multiple and perhaps invalid character encodings, the point at which affected applications fail (in ActionController::Response) is a problem. Should an error occur due to character encoding, the request should at least not die deep inside ActionController.
More specifically, the current behavior of String#blank? does not conform to the statement atop active_support/core_ext/object/blank.rb, which reads: "An object is blank if it's false, empty, or a whitespace string." This method should not raise an exception when called on a String object, regardless of encoding. The modified test included in this patch demonstrates the failure by supplying a string, converting it to UTF-8, and attempting to call .blank? on it.
The behavior of the patched version in Ruby 1.9 is as follows: if a string with an invalid encoding is supplied and a regex test would result in fireworks, the method falls back to String#strip.size to test for blankness.
This seems to be a faithful implementation of the method that is compatible with both 1.8.x and 1.9.x while remaining minimally-invasive.
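The patch itself is not reproduced in this thread. What follows is a minimal sketch of the fallback idea with one labeled substitution: the strip runs on a binary copy of the string (String#b) instead of on the string directly, on the assumption that String#strip may itself reject broken strings on some Ruby versions. `tolerant_blank?` is a hypothetical name.

```ruby
# Hypothetical helper, not the patch attached to the ticket.
def tolerant_blank?(str)
  # Usual path: the string is blank when it has no non-whitespace character.
  str !~ /\S/
rescue ArgumentError
  # The regex refused a string with an invalid encoding. Fall back to
  # stripping a binary copy, where any byte sequence is acceptable.
  str.b.strip.empty?
end

broken = " \xAA ".dup.force_encoding(Encoding::UTF_8)

puts tolerant_blank?("   ")
puts tolerant_blank?("text")
puts tolerant_blank?(broken)
```

Well-formed strings never reach the rescue branch, so their behavior is unchanged. Only broken strings take the slower path.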
Apologies in advance for writing so much about such a simple bug and a one-line patch! Its obscurity seemed to warrant a little explanation.
I'd love to hear your feedback, and of any plans to include this or a similar patch in a future release.
Regards,
Scott Andreas (@cscotta)
-
Scott July 14th, 2009 @ 08:05 AM
Got an S3 error when attempting to download the patch attached to this ticket.
I've posted it here in case this doesn't fix itself: http://u.phoreo.com/01.diff
-
Michael Koziarski July 15th, 2009 @ 12:48 AM
- Assigned user set to Jeremy Kemper
Jeremy's the 1.9 guy
-
James Healy July 19th, 2009 @ 01:22 PM
+1 to this patch.
An alternative might be to output a warning to the log file if a template contains invalid UTF-8 bytes, then use iconv to strip the offending bytes out. That way the template will always be UTF-8 and will hopefully avoid other similar issues like this blank?() one.
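A minimal sketch of that idea, with one substitution stated plainly: it uses String#scrub (available in Ruby 2.1 and later) in place of iconv. `sanitize_utf8` and its `log:` parameter are hypothetical names, and where such a call would sit in the rendering pipeline is left open.

```ruby
require 'stringio'

# Hypothetical helper: warn about invalid bytes, then drop them.
# String#scrub stands in for the iconv call suggested above (assumption).
def sanitize_utf8(text, log: $stderr)
  return text if text.valid_encoding?

  log.puts "warning: dropping invalid UTF-8 bytes from template output"
  text.scrub('')
end

log = StringIO.new
dirty = "ab\xAAcd".dup.force_encoding(Encoding::UTF_8)
clean = sanitize_utf8(dirty, log: log)

puts clean.valid_encoding?
puts log.string
```

Unlike a change to blank?, this alters the response body, so it only suits content where silently losing the offending bytes is acceptable.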
-
Jeremy Kemper August 2nd, 2009 @ 04:32 AM
- State changed from new to wontfix
Changing blank? masks the underlying error instead of fixing the root cause. Why do you have a string with invalid encoding?
-
pederbl (at jobstar) January 11th, 2011 @ 01:20 PM
- Tag changed from 2-3-stable, activesupport, ruby1.9, ruby19 to 2-3-stable, activesupport, rails3, ruby1.9, ruby19
- Importance changed from to
Strings with invalid encodings are supported by Ruby, so it seems natural that Rails should support calling blank? on them.
There are other ways to find out whether a string is invalid, e.g. valid_encoding?
What is the benefit of making blank? fail on strings with invalid encoding?
Tickets have moved to GitHub
The new ticket tracker is available at https://github.com/rails/rails/issues
Referenced by
- 4256 Response should be encoded in charset ...as the call to blank? assumes the string is UTF-8 and ...