#6559 ✓resolved
ActiveRecord fails with both non ASCII string and binary on SQLite3Adapter

Reported by naruse | March 11th, 2011 @ 04:59 AM

Summary

ActiveRecord (via Arel) builds an SQL statement string from its arguments.
When the arguments include both a non-ASCII string (for example UTF-8) and binary data,
it raises Encoding::CompatibilityError.

How to reproduce

For example, create an app with the following commands:

rails new foo
cd foo
rails generate scaffold bar name:string data:binary
rake db:migrate
rails server

Then open http://127.0.0.1:3000/bars/new, enter a UTF-8 (non-ASCII) string in both fields,
and press the [Create Bar] button.
The request fails with Encoding::CompatibilityError.

Why

ActiveRecord (Arel) generates an SQL statement from those arguments (a UTF-8 string and binary data).
At some point it concatenates the UTF-8 string with the binary string, and Ruby raises Encoding::CompatibilityError.
(In arel 2.0.8 this happens at lib/arel/visitors/to_sql.rb:70:in `join'.)
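
The failure can be reproduced outside Rails. The following is a minimal sketch, not the Arel code itself: the variable names and sample values are made up for illustration, and it only assumes that the quoted values end up in one Array#join call, as the backtrace suggests.

```ruby
# Two already-quoted SQL values with different encodings (illustrative data).
name = "'\u3042'"        # UTF-8 string containing a non-ASCII character
data = "'\xFF\xFE'".b    # binary (ASCII-8BIT) string containing high bytes

# Joining them mixes two incompatible encodings, so Ruby raises.
error = begin
  [name, data].join(", ")
  nil
rescue Encoding::CompatibilityError => e
  e
end

# An ASCII-only string is compatible with binary data, so this join succeeds.
# That is why the problem only shows up with non-ASCII input.
ascii_ok = ["'abc'", data].join(", ")
```

Here error holds the Encoding::CompatibilityError, while ascii_ok is built without any exception.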

How to fix

The simple fix is to force both strings to the same encoding, but which encoding, and where?
Under the policy that "a string should be valid", this leads to another question:
"What is the correct encoding of the resulting SQL statement string?" There seem to be two options.

UTF-8

If the resulting SQL statement string should be UTF-8, the binary data must be escaped.
I wrote a patch for SQLite3Adapter:

% diff -u lib/active_record/connection_adapters/sqlite3_adapter.rb.orig lib/active_record/connection_adapters/sqlite3_adapter.rb
--- lib/active_record/connection_adapters/sqlite3_adapter.rb.orig 2011-03-11 11:10:39.000000000 +0900
+++ lib/active_record/connection_adapters/sqlite3_adapter.rb   2011-03-11 11:11:10.000000000 +0900
@@ -38,6 +38,15 @@
   module ConnectionAdapters #:nodoc:
     class SQLite3Adapter < SQLiteAdapter # :nodoc:
 
+      def quote(value, column = nil)
+        if value.kind_of?(String) && column && column.type == :binary && column.class.respond_to?(:string_to_binary)
+          s = column.class.string_to_binary(value).unpack("H*")[0]
+          "x'#{s}'"
+        else
+          super
+        end
+      end
+
       # Returns the current database encoding format as a string, eg: 'UTF-8'
       def encoding
         if @connection.respond_to?(:encoding)
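
The escaping idea in this patch can be tried in isolation. This is a rough sketch under assumptions: sqlite_blob_literal is a hypothetical helper name, not part of ActiveRecord, and the sample values are made up. It relies on SQLite's x'...' hexadecimal BLOB literal, which is the form the patch emits.

```ruby
# Hypothetical helper: turn raw bytes into an SQLite hex BLOB literal.
# The result contains only ASCII characters, so it mixes safely with UTF-8.
def sqlite_blob_literal(bytes)
  "x'#{bytes.unpack("H*")[0]}'"
end

name = "'\u3042'"                              # quoted UTF-8 value
blob = sqlite_blob_literal("\xFF\xFE\x00".b)   # quoted binary value

# The join no longer mixes incompatible encodings.
sql = "VALUES (#{[name, blob].join(', ')})"
```

With this approach the whole statement stays a valid UTF-8 string, at the cost of doubling the size of the binary payload in the SQL text.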

ASCII-8BIT (BINARY)

If it should be binary, every part must have ASCII-8BIT encoding.
In that case the patch belongs in Arel, as follows:

% diff -u lib/arel/visitors/to_sql.rb.orig lib/arel/visitors/to_sql.rb
--- lib/arel/visitors/to_sql.rb.orig    2011-03-11 13:48:57.000000000 +0900
+++ lib/arel/visitors/to_sql.rb 2011-03-11 13:58:19.000000000 +0900
@@ -68,7 +68,9 @@
 
       def visit_Arel_Nodes_Values o
         "VALUES (#{o.expressions.zip(o.columns).map { |value, column|
-          quote(value, column && column.column)
+          str = quote(value, column && column.column)
+          str.force_encoding(Encoding::ASCII_8BIT) if str.respond_to?(:force_encoding)
+          str
         }.join ', '})"
       end
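
The effect of this second option can also be checked with plain Ruby. Again this is only a sketch with made-up values, not the Arel visitor: it retags every quoted part as binary before joining, which is what the patched block does.

```ruby
name = "'\u3042'"        # quoted UTF-8 value
data = "'\xFF\xFE'".b    # quoted binary value

# Retag every part as binary. Encoding::BINARY is an alias of ASCII-8BIT.
# force_encoding changes only the encoding label, never the bytes;
# dup keeps the original strings untouched.
parts = [name, data].map { |s| s.dup.force_encoding(Encoding::BINARY) }

# All parts now share one encoding, so the join succeeds.
sql = "VALUES (#{parts.join(', ')})"
```

The resulting statement is a binary string: the bytes of the UTF-8 value are preserved, but Ruby no longer treats them as characters.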

Conclusion

Please apply either of the patches above.
