Incorrect behavior of truncate on utf-8 strings
Reported by tulskiy | March 1st, 2010 @ 07:38 AM
I have a problem with truncate on Russian strings. For example, if the string is "Словарь", then
s = truncate(product.description, :length => 9)
returns the correct string, with bytes
208 161 208 187 208 190 46 46 46
But if I truncate to 10 characters, it returns the following byte sequence:
208 161 208 187 208 190 208 46 46 46
where the 208 before the last three dots is the start of the next UTF-8
character. So it looks like truncate cuts based on byte length rather than
character length, and ends up splitting a multibyte character.
This results in an 'invalid byte sequence in UTF-8' exception.
I'm using Rails 2.3.5 and Ruby 1.9.1 on Ubuntu 9.10.
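The two byte sequences above can be reproduced outside Rails. This is only a minimal sketch, assuming a Ruby with String#byteslice (1.9.3 or later) and a UTF-8 source file; truncate_by_bytes and truncate_by_chars are hypothetical helpers written for illustration and are not the Rails truncate implementation.

```ruby
# Hypothetical helper: cuts on byte boundaries, which can split a character.
def truncate_by_bytes(text, length, omission = "...")
  return text if text.bytesize <= length
  text.byteslice(0, length - omission.bytesize) + omission
end

# Hypothetical helper: cuts on character boundaries.
def truncate_by_chars(text, length, omission = "...")
  return text if text.length <= length
  text[0, length - omission.length] + omission
end

s = "Словарь"                      # 7 characters, 14 bytes in UTF-8

ok     = truncate_by_bytes(s, 9)   # keeps 6 bytes, i.e. three whole characters
broken = truncate_by_bytes(s, 10)  # keeps 7 bytes, ending mid-character

p ok.bytes.to_a                    # matches the first sequence in the report
p broken.bytes.to_a                # matches the second sequence, with the stray 208
p broken.valid_encoding?           # false: the string is no longer valid UTF-8

p truncate_by_chars(s, 10) == s    # 7 characters fit within 10, nothing is cut
```

A length of 9 only looks correct because the cut happens to land on a character boundary; any odd number of kept bytes splits one of the two-byte Cyrillic characters.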
Comments and changes to this ticket
-
tulskiy March 1st, 2010 @ 07:39 AM
Forgot to add, I'm using Rails 2.3.5 and Ruby 1.9.1 on Ubuntu 9.10.
-
Jeremy Kemper March 1st, 2010 @ 07:32 PM
- Assigned user set to Jeremy Kemper
- State changed from new to invalid
What is product.description.encoding? I bet it's ASCII, not UTF-8. You need to update your mysql/pg/sqlite driver to a newer version that supports Ruby 1.9 string encodings.
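A rough way to check this diagnosis in a console, as a sketch only: the raw string below simulates what an encoding-unaware driver might hand back (an assumption for illustration, using a binary-tagged copy of the same bytes), and force_encoding is shown as a way to inspect the problem, not as a replacement for updating the driver.

```ruby
utf8 = "Словарь"

# Simulated driver output (assumption): same bytes, but tagged as binary
# instead of UTF-8, so Ruby treats every byte as one character.
raw = utf8.dup.force_encoding(Encoding::BINARY)

p raw.encoding == Encoding::UTF_8   # false
p raw.length                        # 14, counts bytes
p utf8.length                       # 7, counts characters

# Re-tagging the bytes as UTF-8 restores character semantics,
# provided the bytes really are valid UTF-8.
fixed = raw.dup.force_encoding(Encoding::UTF_8)
p fixed.length                      # 7
p fixed.valid_encoding?             # true
```

If product.description.encoding reports something other than UTF-8, length-based helpers will count bytes, which matches the behavior described in the report.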