This project is archived and is in readonly mode.

#2628 ✓wontfix
Dimitrij Denissenko

Ruby 1.9 and ActiveSupport

Reported by Dimitrij Denissenko | May 9th, 2009 @ 08:45 PM | in 2.x

Hi!

I came across the following error in ActiveSupport (in combination with Ruby 1.9):

    $ irb1.9 -r activesupport
    irb(main):001:0> "\xAA".blank?
    ArgumentError: invalid byte sequence in UTF-8
            from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `=~'
            from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `!~'
            from /usr/local/lib/ruby1.9/gems/1.9.1/gems/activesupport-2.3.2/lib/active_support/core_ext/blank.rb:50:in `blank?'
            from (irb):1
            from /usr/local/bin/irb1.9:12:in `<main>'

Some background: in my case, the application fails in ActionController::Response:

    module ActionController # :nodoc:
      class Response < Rack::Response

        def etag=(etag)
          if etag.blank?
            headers.delete('ETag')
          else
            headers['ETag'] = %("#{Digest::MD5.hexdigest(ActiveSupport::Cache.expand_cache_key(etag))}")
          end
        end
      end
    end

The generated etag contains "\xAA", so the call to etag.blank? raises the error shown above.

Any clue how to deal with that? Any idea for a workaround?

Comments and changes to this ticket

  • Scott

    Scott July 14th, 2009 @ 03:03 AM

    Hi Dimitrij,

    This issue just bit me on Ruby 1.9.1 as well. For me, it occurred on an action involving a CSV data export.

    For the time being, I just wrapped the method body in a begin/rescue block and called headers.delete('ETag') if the test failed. As the ETag is not critical to the functionality of the application, it seemed a decent (if hasty) response.
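
A minimal sketch of that stopgap, not the actual Rails source. `SketchResponse` is a hypothetical stand-in for ActionController::Response, the blank test is inlined as a `!~ /\S/` check (assumed to mirror what blank.rb does, going by the backtrace), and the ActiveSupport::Cache.expand_cache_key step is left out so the example runs without Rails.

```ruby
require "digest/md5"

# Hypothetical stand-in for the response object; only the headers hash matters here.
class SketchResponse
  attr_reader :headers

  def initialize
    @headers = {}
  end

  def etag=(etag)
    # Inlined regex-based blank test (assumption: equivalent to String#blank?).
    if etag.nil? || etag.to_s !~ /\S/
      headers.delete("ETag")
    else
      headers["ETag"] = %("#{Digest::MD5.hexdigest(etag.to_s)}")
    end
  rescue ArgumentError
    # The regex match rejected an invalidly encoded string: give up on the ETag.
    headers.delete("ETag")
  end
end

response = SketchResponse.new
response.etag = "\xAA".dup.force_encoding(Encoding::UTF_8)
puts response.headers.key?("ETag")
```

The trade-off is that responses with invalidly encoded bodies simply go out without an ETag, which hides the encoding problem rather than solving it.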

    Please keep me posted if you've found a better workaround, or if this issue has been resolved.

    Thanks,

    C. Scott Andreas
    Developer, Sunago.org

  • Scott

    Scott July 14th, 2009 @ 08:01 AM

    Hello,

    I've taken some time to better understand this behavior, patch the underlying issue, and prepare a test demonstrating the bug and verifying the resolution.

    In my case, the application was also failing in ActionController::Response. The data rendered in my action contained a non-UTF-8 character that is coerced to UTF-8 during rendering (likely from ASCII_8BIT). This behavior is tested in part in actionpack/test/new_base/render_test.rb:

        def test_render_utf8_template_with_magic_comment
          with_external_encoding Encoding::ASCII_8BIT do
            result = @view.render(:file => "test/utf8_magic.html.erb", :layouts => "layouts/yield")
            assert_equal "Русский текст\nUTF-8\nUTF-8\nUTF-8\n", result
            assert_equal Encoding::UTF_8, result.encoding
          end
        end
    

    However, in the event that non-UTF-8 data is coerced into UTF-8 and String#blank? is called on it, an ArgumentError is raised, because the current implementation (a regular expression match) raises when it receives a string with invalid encoding. So when the rendered response is evaluated to prepare ETags in ActionController::Response, the application breaks while testing the document body for blankness.
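
One way such a string can come about (an assumption about the cause, not something confirmed in this ticket) is relabeling binary data as UTF-8 without transcoding it. The sketch below contrasts relabeling with transcoding; the sample byte and the variable names are illustrative only.

```ruby
# A single non-ASCII byte held as binary (ASCII-8BIT) data.
raw = "\xAA".b

# force_encoding only changes the label; the bytes are not checked or converted.
relabeled = raw.dup.force_encoding(Encoding::UTF_8)

# encode actually transcodes, substituting bytes that have no UTF-8 mapping.
transcoded = raw.encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")

puts relabeled.valid_encoding?   # false
puts transcoded.valid_encoding?  # true
puts transcoded                  # ?
```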

    Here are steps to reproduce the bug, both in a Rails 2.3.2 app on Ruby 1.9.1 in a console, and in IRB by requiring ActiveSupport: http://gist.github.com/146722

    See the backtrace here: http://u.phoreo.com/lu.html

    While it is not surprising that an error could result at some level when working with data of multiple and perhaps invalid character encodings, the point at which affected applications fail (in ActionController::Response) is a problem. Should an error occur due to character encoding, the request should at least not die deep inside ActionController.

    More specifically, the current behavior of String#blank? does not conform to the statement atop active_support/core_ext/object/blank.rb, which reads: "An object is blank if it's false, empty, or a whitespace string." This method should not raise an exception when called on a String object, regardless of encoding. The modified test included in this patch demonstrates the failure by supplying a string, converting it to UTF-8, and attempting to call .blank?.

    The behavior of the patched version in Ruby 1.9 is as follows: if a string of invalid encoding is supplied and a regex test would result in fireworks, the method falls back to String.strip.size to test for blankness.
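
The fallback idea can be sketched roughly as follows. This is not the attached patch: `tolerant_blank?` is a hypothetical helper, and instead of strip.size it retries the regex on a binary copy of the string (String#b, Ruby 2.0 or later), since a binary string is never considered invalidly encoded.

```ruby
# Hypothetical helper: regex-based blank test with a byte-level fallback.
def tolerant_blank?(str)
  str !~ /\S/
rescue ArgumentError
  # The match rejected the string's encoding; repeat the test on raw bytes.
  str.b !~ /\S/
end

broken = "\xAA".dup.force_encoding(Encoding::UTF_8)

puts tolerant_blank?(broken)   # false
puts tolerant_blank?(" \t\n")  # true
puts tolerant_blank?("")       # true
```

On the byte level only ASCII whitespace counts as whitespace, so a string consisting solely of invalid bytes is reported as not blank.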

    This seems to be a faithful implementation of the method that is compatible with both 1.8.x and 1.9.x while remaining minimally-invasive.

    Apologies in advance for writing so much about such a simple bug and a one-line patch! Its obscurity seemed to warrant a little explanation.

    I'd love to hear your feedback, and of any plans to include this or a similar patch in a future release.

    Regards,

    Scott Andreas (@cscotta)

  • Scott

    Scott July 14th, 2009 @ 08:05 AM

    Got an S3 error when attempting to download the patch attached to this ticket.

    I've posted it here in case this doesn't fix itself: http://u.phoreo.com/01.diff

  • Michael Koziarski

    Michael Koziarski July 15th, 2009 @ 12:48 AM

    • Assigned user set to “Jeremy Kemper”

    Jeremy's the 1.9 guy

  • James Healy

    James Healy July 19th, 2009 @ 01:22 PM

    +1 to this patch.

    An alternative might be to output a warning to the log file if a template contains invalid UTF-8 bytes, then use iconv to strip the offending bytes out. That way the template will always be UTF-8 and will hopefully avoid other similar issues like this blank?() one.
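
A rough sketch of that alternative. It swaps iconv for String#scrub (available from Ruby 2.1, so not an option on 1.9), and `scrub_invalid_utf8` and its `log:` parameter are hypothetical names rather than anything in Rails.

```ruby
require "stringio"

# Hypothetical helper: warn about invalid bytes, then drop them.
def scrub_invalid_utf8(text, log: $stderr)
  return text if text.valid_encoding?

  log.puts("warning: template contains invalid #{text.encoding.name} bytes; removing them")
  text.scrub("") # replace each invalid byte sequence with an empty string
end

log = StringIO.new
dirty = "abc\xAAdef".dup.force_encoding(Encoding::UTF_8)
clean = scrub_invalid_utf8(dirty, log: log)

puts clean                  # abcdef
puts clean.valid_encoding?  # true
```

Dropping bytes silently changes the content, which is why the warning is written before the string is cleaned.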

  • Jeremy Kemper

    Jeremy Kemper August 2nd, 2009 @ 04:32 AM

    • State changed from “new” to “wontfix”

    Changing blank? masks the underlying error instead of fixing the root cause. Why do you have a string with invalid encoding?

  • pederbl (at jobstar)

    pederbl (at jobstar) January 11th, 2011 @ 01:20 PM

    • Tag changed from 2-3-stable, activesupport, ruby1.9, ruby19 to 2-3-stable, activesupport, rails3, ruby1.9, ruby19

    Strings with invalid encoding are supported by Ruby, so it seems natural that Rails should support calling blank? on them.

    There is another way to find out whether a string is invalid, e.g. valid_encoding?

    What is the benefit of making blank? fail on strings with invalid encoding?
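
As an illustration of that point, a caller that cares about validity can ask the string directly. `encoding_report` is a hypothetical helper written for this example.

```ruby
# Hypothetical helper: report a string's encoding label and whether its bytes are valid for it.
def encoding_report(str)
  { encoding: str.encoding.name, valid: str.valid_encoding? }
end

broken = "\xAA".dup.force_encoding(Encoding::UTF_8)

p encoding_report(broken)
p encoding_report("plain text")
```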


Tickets have moved to GitHub. The new ticket tracker is available at https://github.com/rails/rails/issues
