#4075 Incorrect behavior of truncate on utf-8 strings - Ruby on Rails

This project is archived and is in readonly mode.

#4075 ✓invalid

Incorrect behavior of truncate on utf-8 strings

Reported by tulskiy | March 1st, 2010 @ 07:38 AM

I have a problem with truncate on Russian strings. For example, if product.description is "Словарь", then

s = truncate(product.description, :length => 9)

returns a correct string with the bytes
208 161 208 187 208 190 46 46 46

But if I truncate to a length of 10, it returns the following byte sequence:
208 161 208 187 208 190 208 46 46 46
where the 208 before the three trailing dots is the first byte of the next UTF-8 character. So it looks like truncate cuts by byte length rather than character length, and ends up splitting a multi-byte character.

This results in an 'invalid byte sequence in UTF-8' exception.

I'm using Rails 2.3.5 and Ruby 1.9.1 on Ubuntu 9.10.
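
The byte sequences reported above can be reproduced in plain Ruby. The following is only a sketch: `truncate_bytes` and `truncate_chars` are hypothetical helpers written for illustration, not the Rails `truncate` implementation, and the snippet assumes a Ruby version that provides `String#byteslice`.

```ruby
# encoding: utf-8
s = "Словарь" # 7 characters, 14 bytes in UTF-8

# Hypothetical helper: counts and cuts by bytes, which is the behavior
# the reported output suggests.
def truncate_bytes(text, length, omission = "...")
  return text if text.bytesize <= length
  text.byteslice(0, length - omission.bytesize) + omission
end

# Hypothetical helper: counts and cuts by characters.
def truncate_chars(text, length, omission = "...")
  return text if text.length <= length
  text[0, length - omission.length] + omission
end

by_bytes = truncate_bytes(s, 10)
by_chars = truncate_chars(s, 6)

p by_bytes.bytes           # a lone 208 sits right before the three dots
p by_bytes.valid_encoding? # the cut landed inside a 2-byte character
p by_chars.bytes           # only whole characters, then the dots
```

Cutting 7 bytes out of a string made of 2-byte characters necessarily leaves half a character behind, which matches the second byte sequence in the report.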

Comments and changes to this ticket

tulskiy March 1st, 2010 @ 07:39 AM
Forgot to add, I'm using Rails 2.3.5 and Ruby 1.9.1 on Ubuntu 9.10.
Jeremy Kemper March 1st, 2010 @ 07:32 PM
- Assigned user set to “Jeremy Kemper”
- State changed from “new” to “invalid”
What is product.description.encoding? I bet it's ASCII, not UTF-8. You need to update your mysql/pg/sqlite driver to a newer version that supports 1.9 string encodings.
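
A quick way to check the diagnosis above is to inspect the string's encoding tag. The sketch below simulates, as an assumption, a driver that returns UTF-8 bytes tagged as binary; the variable names are illustrative. Re-tagging with `force_encoding` is shown only to demonstrate the effect of the tag, while the fix suggested in the comment above is to update the driver.

```ruby
# encoding: utf-8
# Assumption: an encoding-unaware driver returns UTF-8 bytes tagged as binary.
raw = "Словарь".dup.force_encoding(Encoding::BINARY)

p raw.encoding # not UTF-8, so Ruby treats every byte as one character
p raw.length   # equals raw.bytesize

# Re-tagging the same bytes as UTF-8 restores character-based counting.
fixed = raw.dup.force_encoding("UTF-8")

p fixed.length          # counts characters again
p fixed.valid_encoding? # the bytes were valid UTF-8 all along
```

If `length` and `bytesize` agree for a string that contains Cyrillic text, any character-based slicing on it will in fact be slicing bytes.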

Tickets have moved to GitHub

The new ticket tracker is available at https://github.com/rails/rails/issues
