to_json does not escape unicode characters
Reported by Ryan Sonnek | October 7th, 2009 @ 10:27 PM
The current JSON encode method does not properly escape non-ascii characters.
'bad characters'.concat(16).to_json => "bad characters\020"
Should be:
=> "bad characters\u0010"
The Ruby json gem handles this correctly. An acceptable workaround would be to have ActiveSupport::JSON.encode delegate to ActiveSupport::JSON.backend, which can be configured to use the Ruby json gem.
Tested in Rails 2.3.4.
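For comparison, here is a minimal sketch of the expected behaviour using Ruby's json library directly rather than ActiveSupport. The variable names are illustrative, and the string is wrapped in an array only because older json versions reject a bare string at the top level.

```ruby
require 'json'

# The string from the report: "bad characters" followed by U+0010.
input = 'bad characters' + 16.chr

# Control characters below U+0020 must be escaped in JSON strings.
encoded = JSON.generate([input])
puts encoded

# The escaped output parses back to the original string.
decoded = JSON.parse(encoded).first
puts decoded == input
```

The printed JSON contains the six-character escape `\u0010` in place of the raw control byte, which is what the report asks `to_json` to produce.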
Comments and changes to this ticket
-
Shannon -jj Behrens December 11th, 2009 @ 01:34 AM
I think I stumbled across the same issue.
$ ./script/console
Loading development environment (Rails 2.3.5)
ActiveSupport::JSON.decode("[1]") => [1]
ActiveSupport::JSON.decode("[1]") => "[1]"
I'm executing the same code twice, and getting two different answers. It turns out that the second one has a BOM (Unicode byte order mark) before the "[". You can't see it. It probably won't make it through. When I paste it into Vim, I see:
ActiveSupport::JSON.decode("[1]")
I'm sorry, I don't know how to represent that as a String literal.
Anyway, if the BOM is there, JSON.decode gives you a String instead of a list. I think it should either ignore the BOM completely or raise an error. I'd love to get an error if there is weird crud like that.
-
Shannon -jj Behrens December 11th, 2009 @ 01:36 AM
Ok, wow, let me try that again:
ActiveSupport::JSON.decode("[1]") produces [1] # A list.
ActiveSupport::JSON.decode("[1]") produces "[1]" # A string.
The second one has a BOM before the "[". In Vim it shows it as (less than)feff(greater than).
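The two behaviours suggested above (ignore the BOM, or raise an error) could be sketched roughly as follows. This is not ActiveSupport code: both helper names are hypothetical, the input is assumed to be a UTF-8 string, and Ruby's json library stands in for the decoder.

```ruby
require 'json'

# U+FEFF, the byte order mark, shown by Vim as <feff>.
BOM = "\uFEFF"

# Hypothetical helper: reject input that starts with a BOM.
def decode_json_strict(text)
  if text.start_with?(BOM)
    raise ArgumentError, 'JSON input begins with a byte order mark'
  end
  JSON.parse(text)
end

# Hypothetical helper: silently drop a leading BOM, then decode.
def decode_json_lenient(text)
  JSON.parse(text.sub(/\A\uFEFF/, ''))
end

p decode_json_lenient("#{BOM}[1]")
```

Either variant avoids the surprising case where the same visible text decodes to a String instead of a list.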
-
Dwayne Litzenberger (Infonium Inc.) December 22nd, 2009 @ 08:32 PM
- Tag changed from json, unicode to json, patch, unicode
I've attached a patch that escapes the control characters \x00-\x1f, per RFC 4627.
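As an illustration of the kind of escaping RFC 4627 requires, here is a minimal sketch. It is not the attached patch: `escape_json_string` and `ESCAPE_MAP` are hypothetical names, and the short escapes (`\n`, `\t`, and so on) are one reasonable choice among those the RFC permits.

```ruby
require 'json'

# Short escapes for characters that have one; everything else in
# \x00-\x1f falls back to the \uXXXX form.
ESCAPE_MAP = {
  '"'  => '\\"',
  '\\' => '\\\\',
  "\b" => '\\b',
  "\f" => '\\f',
  "\n" => '\\n',
  "\r" => '\\r',
  "\t" => '\\t'
}.freeze

# Hypothetical helper: return +str+ as a quoted JSON string literal.
def escape_json_string(str)
  escaped = str.gsub(/["\\\x00-\x1f]/) do |ch|
    ESCAPE_MAP.fetch(ch) { format('\\u%04x', ch.ord) }
  end
  '"' + escaped + '"'
end

puts escape_json_string('bad characters' + 16.chr)
```

Feeding the result back through a JSON parser is a quick way to check that no raw control character slipped through.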
-
Dwayne Litzenberger (Infonium Inc.) December 22nd, 2009 @ 09:32 PM
The previous patch was against 2-3-stable. This patch is against the master branch.
-
Repository December 23rd, 2009 @ 07:46 PM
- State changed from new to committed
(from [a9002056761a481589852d6e8680f752a5b823b7]) Fix ActiveSupport::JSON encoding of control characters [\x00-\x1f]
According to RFC 4627, only the following Unicode code points are
allowed unescaped in JSON:
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
However, ActiveSupport::JSON did not escape the range %x00-1f. This caused
parse errors when trying to decode the resulting output.
[#3345 state:committed]
Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
http://github.com/rails/rails/commit/a9002056761a481589852d6e8680f7...
-
Repository December 23rd, 2009 @ 07:46 PM
(from [808cad2bb4f1534a66e20fb5bfedd09e3678e278]) Fix ActiveSupport::JSON encoding of control characters [\x00-\x1f]
According to RFC 4627, only the following Unicode code points are
allowed unescaped in JSON:
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
However, ActiveSupport::JSON did not escape the range %x00-1f. This caused
parse errors when trying to decode the resulting output.
[#3345 state:committed]
Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
http://github.com/rails/rails/commit/808cad2bb4f1534a66e20fb5bfedd0...
Tickets have moved to GitHub
The new ticket tracker is available at https://github.com/rails/rails/issues
Referenced by
- 3345 to_json does not escape unicode characters [#3345 state:committed]