
#6247 new
Laurent Farcy

ActiveSupport::JSON.decode fails on long unicode sequence with default YAML-based parser

Reported by Laurent Farcy | January 3rd, 2011 @ 01:13 PM

Here's a reply received from the Twitter API (users/show call) that the YAML-based JSON decoder is unable to parse.

{"profile_background_tile":true,"time_zone":"Quito","friends_count":55,"favourites_count":1,"description":"","status":{"in_reply_to_status_id_str":null,"place":null,"in_reply_to_user_id":null,"text":"RT @oizo3000: .\u00aa)\u0004.k\u00f6\u00ee\u02c6_\u00d9\u00f1\u00ba#\u00eb\u00d9\u00ff\u0000vm\u00df\u017d\u00b0\u007D\u0178\u0018\u000b\u00eb\u00b9\u00df\u00fe!w\u0192\u00ec\u00f8\u00d6f\u00a4\u00a58\u00b2\u00a8\u009d8\u00e9\"\u00fdP\u201c\u0012\u00e8cC\u00aaZw\u00d5\u00b4\nb\u001d\u00ce\u2014\u00b3\u007D\u00a3K\u00d6\u00c7)\u00be\u00b18\u0015\u001a\u00a1:\u00f8\u00fc\u00e1\u00e1\u0178\u00ac\u0000\u00a2\u0013\u0000\u0018zi\u00d7\u00a6\u00d0\u0006\u2022\u00e5\u00e1\u00f8\u00c0\u0000f]o<\u2020\u00d1\u00aa\u017d\u00c2\u00001m\u00a9\u00b7T\u2014P\u00a4 ...","contributors":null,"retweet_count":38,"in_reply_to_user_id_str":null,"retweeted_status":{"in_reply_to_status_id_str":null,"place":null,"in_reply_to_user_id":null,"text":".\u00aa)\u0004.k\u00f6\u00ee\u02c6_\u00d9\u00f1\u00ba#\u00eb\u00d9\u00ff\u0000vm\u00df\u017d\u00b0\u007D\u0178\u0018\u000b\u00eb\u00b9\u00df\u00fe!w\u0192\u00ec\u00f8\u00d6f\u00a4\u00a58\u00b2\u00a8\u009d8\u00e9\"\u00fdP\u201c\u0012\u00e8cC\u00aaZw\u00d5\u00b4\nb\u001d\u00ce\u2014\u00b3\u007D\u00a3K\u00d6\u00c7)\u00be\u00b18\u0015\u001a\u00a1:\u00f8\u00fc\u00e1\u00e1\u0178\u00ac\u0000\u00a2\u0013\u0000\u0018zi\u00d7\u00a6\u00d0\u0006\u2022\u00e5\u00e1\u00f8\u00c0\u0000f]o<\u2020\u00d1\u00aa\u017d\u00c2\u00001m\u00a9\u00b7T\u2014P\u00a4,ox\u0000\u00f9\t7'&\u00c2Y\u0019\u00ed\u00aa\u00bd \tkl\u00ad","contributors":null,"retweet_count":38,"in_reply_to_user_id_str":null,"retweeted":false,"id_str":"18078479593504768","source":"web","truncated":false,"geo":null,"in_reply_to_status_id":null,"favorited":false,"id":18078479593504768,"coordinates":null,"in_reply_to_screen_name":null,"created_at":"Thu Dec 23 23:00:20 +0000 
2010"},"retweeted":false,"id_str":"18108105204170753","source":"web","truncated":true,"geo":null,"in_reply_to_status_id":null,"favorited":false,"id":18108105204170753,"coordinates":null,"in_reply_to_screen_name":null,"created_at":"Fri Dec 24 00:58:03 +0000 2010"},"verified":false,"profile_link_color":"0084B4","location":"","follow_request_sent":false,"profile_sidebar_border_color":"BDDCAD","id_str":"18656867","show_all_inline_media":false,"geo_enabled":false,"url":"http:\/\/fullyfitted.blogspot.com\/","profile_use_background_image":true,"lang":"en","profile_background_color":"9AE4E8","profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/69874360\/xxx_logobar_web_normal.jpg","is_translator":false,"listed_count":55,"profile_background_image_url":"http:\/\/a2.twimg.com\/profile_background_images\/4349971\/airhorn.gif","followers_count":1602,"protected":false,"contributors_enabled":false,"notifications":false,"screen_name":"alexxxchange","name":"alexxxchange","statuses_count":737,"following":false,"profile_text_color":"333333","id":18656867,"utc_offset":-18000,"created_at":"Tue Jan 06 02:00:44 +0000 2009","profile_sidebar_fill_color":"DDFFCC"}

Unfortunately, I have not managed to pinpoint the exact sequence the parser fails on, but the culprit seems to be the text attribute with its many Unicode escape sequences.
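One way to narrow the culprit down would be to feed each \uXXXX escape to the decoder in isolation and see which ones it rejects. The following is only a minimal sketch: `failing_escapes` is a hypothetical helper (not part of Rails), and it assumes the failure is triggered by a single escape rather than by a combination of characters. The decoder is passed in as a lambda, so the same helper could wrap ActiveSupport::JSON.decode in a Rails 2.3 console; here it is shown with the JSON gem only.

```ruby
require 'json'

# Hypothetical helper (not part of Rails): wraps every distinct \uXXXX escape
# found in json_text in a tiny JSON document and returns the escapes that make
# the given decoder raise.
def failing_escapes(json_text, decoder)
  json_text.scan(/\\u[0-9a-fA-F]{4}/).uniq.select do |escape|
    begin
      decoder.call(%({"text":"#{escape}"}))
      false
    rescue StandardError
      true
    end
  end
end

# Start of the "text" attribute from the reply above. Single quotes keep the
# backslashes literal, so the decoder sees the raw escapes.
sample = '{"text":"RT @oizo3000: .\u00aa)\u0004.k\u00f6\u00ee\u02c6_\u00d9\u00f1\u00ba#\u00eb\u00d9\u00ff\u0000vm"}'

# Decoder under test: the JSON gem. In a Rails 2.3 console the lambda body
# could call ActiveSupport::JSON.decode(doc) instead.
gem_decoder = lambda { |doc| JSON.parse(doc) }
gem_failures = failing_escapes(sample, gem_decoder)
```

If the list comes back empty for a decoder that still fails on the full reply, the problem likely depends on context (neighbouring characters or overall length), and bisecting the string itself would be the next step.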

Here's the exception I get (after modifying the original code base to surface the root cause):

ArgumentError: syntax error on line 0, col 268: `{"profile_background_tile": true, "time_zone": "Quito", "friends_count": 55, "favourites_count": 1, "description": "", "status": {"in_reply_to_status_id_str": null, "place": null, "in_reply_to_user_id": null, "text": "RT @oizo3000: .ª).köîˆ_Ùñº#ëÙÿ'
        from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
        from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
        from /Volumes/Data/Users/lfarcy/workspace/bugfix/vendor/rails/activesupport/lib/active_support/json/backends/yaml.rb:12:in `decode'
        from /Volumes/Data/Users/lfarcy/workspace/bugfix/vendor/rails/activesupport/lib/active_support/json/decoding.rb:11:in `__send__'
        from /Volumes/Data/Users/lfarcy/workspace/bugfix/vendor/rails/activesupport/lib/active_support/json/decoding.rb:11:in `decode'
        from (irb):4

The same JSON chunk works with the JSON gem. To work around this issue, I had to enforce JSONGem as the default JSON backend.
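As a rough illustration of that workaround, the sketch below parses a fragment of the problematic text attribute with the JSON gem directly. The fragment and the variable names are picked for illustration only. The backend switch appears as a comment because it only applies inside a Rails 2.3 application; the setter shown there is the one I believe Rails 2.3 exposes, so check it against your version.

```ruby
require 'json'

# Fragment of the "text" attribute from the reply above, including the
# \u0000 escape. Single quotes keep the backslashes literal.
fragment = '{"text":"\u00ff\u0000vm\u00df\u017d\u00b0\u007D\u0178\u0018\u000b"}'

# The JSON gem decodes every escape, control characters included.
parsed = JSON.parse(fragment)
text = parsed['text']

# In a Rails 2.3 application the backend switch would go in an initializer
# (assumption: the setter as exposed by Rails 2.3.x):
#   ActiveSupport::JSON.backend = 'JSONGem'
```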

Since I'm still running Rails 2.3.5, I tried applying the fix to convert_json_to_yaml that was introduced in 2.3.6 (see #2831), but it did not fix this issue.


