This project is archived and is in readonly mode.
PostgresSQLAdapter#unescape_bytea does not handle "\\"
Reported by Kurt Stephens | February 1st, 2009 @ 04:07 PM | in 2.3.6
"\" is a valid PostgreSQL escape for a literal backslash. This causes very serious data round-trip problems.
The fix is to let the PostgreSQL client lib do the work, rather than scanning for "\\d{3}". It's probably faster, too.
module ActiveRecord module ConnectionAdapters
class PostgreSQLAdapter
# The stock version of this only unescapes_bytea values if
# the raw data contains /\\\d{3}/, which clobbers the
# common escape of '\\'.
def unescape_bytea(value)
PGconn.unescape_bytea(value)
end
end
end end
Comments and changes to this ticket
-
Kurt Stephens February 12th, 2009 @ 01:11 AM
gem list postgres LOCAL GEMS postgres (0.7.9.2008.01.28)
-
Kurt Stephens February 12th, 2009 @ 01:13 AM
gem list --remote postgres REMOTE GEMS postgres (0.7.9.2008.01.28) postgres-pr (0.5.1)
The issue is not with the postgres gem. The hacked-up test for checking if data returned from postgres needs unescaped is wrong and unnecessary.
-
Will Bryant February 16th, 2009 @ 03:25 AM
The reason that I ask is that a lot of the weird stuff we do in postgres was introduced to deal with problems with the gems, and unfortunately it sticks around long after the problems get fixed.
There's a lot of confusion around things like this because there are three related gems, ruby-postgres, postgres, and now pg.
So I'm wondering if ruby-postgres had an issue and we added a workaround back when ruby-postgres was the standard. Even the postgres gem you're using is in the process towards deprecating - current thinking on the mailing list seems to go that we'll move to "pg" only after 2.3.
Anyway, if you upload a patch with a test case for the problem, it's definitely something we'd all like to see fixed. Please test against edge/2.3 RC.
Thanks!
-
Kurt Stephens May 7th, 2009 @ 05:52 AM
The real issue is to let whatever PG client libary Rails is using do the job of quoting and unquoting data. There's no good reason for quoting rules to be handled in ActiveRecord. The client library is either correct or incorrect -- incorrect client libraries should get fixed or disappear.
Unfortunately I don't have resources at the moment to write patches and unit tests but here's a quick snippet that can be added to the existing Postgres tests:
Try to save and load the following ASCII Ruby string as binary data in a postgres bytea column:
"this string contains a literal \\ somewhere inside"
-
CancelProfileIsBroken August 6th, 2009 @ 02:29 PM
- Tag changed from 2.2-stable, activerecord, postgres, postgresql to 2.2-stable, activerecord, bugmash, postgres, postgresql
-
Elise Huard August 9th, 2009 @ 11:13 PM
-1 unable to reproduce on master.
Used native postgresql adapter and postgres (0.7.9.2008.01.28)
-
Matt White August 11th, 2009 @ 10:10 PM
- Tag changed from 2.2-stable, activerecord, bugmash, postgres, postgresql to 2.2-stable, 2.3-stable, activerecord, bugmash, postgres, postgresql
This absolutely IS a problem on Rails 2.3.2 and postgres gem 0.7.9.2008.01.28 (using PostgreSQL 8.3.5). Perhaps it is fixed in edge (haven't checked yet). It seems as though Rails is interpreting double backslashes from Postgres as literal, where each double backslash should in fact be a single escaped backslash. See my script/console output below.
(where "object" is an ActiveRecord object and "field" is a bytea field)
object.field = '\' => "\" puts object.field \ object.save => true object.reload ... puts object.field => \
-
Matt White August 11th, 2009 @ 10:16 PM
Ugh, sorry for the hideous formatting.
object.field = '\'
=> "\"
puts object.field
\
object.save
=> true
object.reload
...
puts object.field
=> \\ -
Jon Jensen August 14th, 2009 @ 01:05 AM
I just repro'd this with edge rails using both postgres and pg. The key here is:
- it has to be a bytea column
- you have to put a literal backslash in it (e.g obj.col = "lol \ omg"), save and reload
- there can be no unprintable octets in the string (0 to 31 and 127 to 255)
http://dev.rubyonrails.org/ticket/8049 and http://dev.rubyonrails.org/changeset/7329 provide some insight. In this massive refactoring, some conditional logic was added to unescape the binary string only when it looks like it's escaped (i.e. there are escaped octets). Note the new "if value =~ /\\\d{3}/" check that appears in two places (one for postgres/pg, the other for ruby-postgres)... this was later changed to the slightly less broken version "if value =~ /\\d{3}/". A relevant comment near the offending if statement reads:
# In each case, check if the value actually is escaped PostgreSQL bytea output # or an unescaped Active Record attribute that was just written.
There may have been reasons for this addition at one point, or perhaps it was always wrong. Regardless, today in the PostgreSQLAdapter, unescape_bytea is only ever called on strings that have just come from the database. So we should always unescape them, whether or not they "look" escaped. As Kurt points out, the check is unnecessary, potentially adds overhead, and breaks things.
As a side note, it would be nice if pg/postgres/ruby-postgres simply knew which columns were binary and which ones weren't when returning a result set, in which case it could handle the unescaping for us. That may not be practical and would certainly break a lot of existing code, so it's probably best to simply fix unescape_bytea to work like it used to.
-
Jon Jensen August 14th, 2009 @ 01:21 AM
Sorry, some of my backslashes got mangled (what else is new? ;).
- Example in bullet point 2 should read
obj.col = "lol \\ omg"
- Initial if statement should read
if value =~ /\\\\\d{3}/
- Second if statement should read
if value =~ /\\\d{3}/
- Example in bullet point 2 should read
-
Rizwan Reza January 20th, 2010 @ 11:36 AM
- Tag changed from 2.2-stable, 2.3-stable, activerecord, bugmash, postgres, postgresql to 2.2-stable, 2.3-stable, activerecord, postgres, postgresql
- Milestone changed from 2.x to 2.3.6
Does this still bug still apply on 2.3.5?
-
Dan Pickett January 25th, 2010 @ 02:13 AM
I can confirm I'm having difficulty with this - it appears like it's doing some extra escaping or something, but something gets malformed.
-
Rizwan Reza January 25th, 2010 @ 09:41 AM
- Tag changed from 2.2-stable, 2.3-stable, activerecord, postgres, postgresql to 2.3-stable, 2.3.5, activerecord, postgres, postgresql
-
...Paul March 8th, 2010 @ 06:50 PM
I think I've run into a very similar version of this with Rails 2.3.5, and this is what I've discovered:
The trick is the PostgresqlAdapter's handling of standard_conforming_strings, and the postgres/pg gem automagically figuring that out already.
The pg gem (0.8.0 or 0.9.0; quite possibly the older postgres 0.79 gem as well, since they probably all rely on the libpg C client lib to do this) is aware of the state of the server's standard_conforming_strings option. This option determines whether or not backslashes need to be escaped (in SQL standard, they do not; Postgres uses them as an escape character, and therefore need to be escaped). When s_c_s is turned on, the native escape function is NOT escaping backslashes, because it doesn't need to -- it's not expecting Rails to put an E before the string (which pushes Postgres back to its non-standard needs-escaping pattern).
However, in PostgresqlAdapter, the quoted_string_prefix method is being defined to return 'E' regardless of what the setting of s_c_s is, when really it should only be doing that if the setting is "off" (and even then, it shouldn't need to, because the pg client lib is doing the right thing already).
So check to see if you have standard_conforming_strings turned on, in your postgresql.conf file. Perhaps this needs to be a separate bug to fix the PostgresqlAdapter's handling of the s_c_s (or maybe, if the behavior of the old postgres gem is different than the newer pg gems, the adapter just needs to be more aware of which gem it's using?)
-
...Paul March 8th, 2010 @ 07:28 PM
If you can't change your database configuration, try this as an initializer:
module ActiveRecord module ConnectionAdapters class PostgreSQLAdapter alias original_connect connect def connect original_connect self.class.instance_eval do define_method(:quoted_string_prefix) { '' } end end end end end
If you're using a recent postgres lib (which is already obeying the standard_conforming_strings setting), and your standard_conforming_strings is turned on, this might solve the problems. It solved ours.
-
Jeremy Kemper March 8th, 2010 @ 08:18 PM
- State changed from new to open
- Assigned user set to Jeremy Kemper
Paul, thanks for the analysis.
We need a patch and test demonstrating that no quoting prefix is used when standard_conforming_strings is set.
Any takers? 2.3.6 will release shortly so let's get this fix out the door.
-
...Paul March 8th, 2010 @ 09:10 PM
- Assigned user cleared.
It's not just dropping the prefix when standard_conforming_strings is set to "on" -- it's also about checking that the Postgres lib being used (pg, postgres, whatever) is doing the right thing with its escaping.
So the logic should probably be something like, if s_c_s is "on", attempt to call the escaping function with "foo \". If the result is "foo \", then don't prefix the quoted string with anything. If the result is "foo \\", then you DO want to prefix the quoted string with an E...
-
...Paul March 8th, 2010 @ 09:11 PM
- Assigned user set to Jeremy Kemper
Oops, didn't mean to clear the responsible party... And I note my backslashes got eaten a bit, but I think the point is obvious.
-
Jeremy Kemper March 8th, 2010 @ 09:16 PM
That's a great idea. Just feature check. Now for a patch... :)
-
Kurt Stephens March 9th, 2010 @ 06:39 AM
Unless I'm completely mistaken -- The E'' prefix has nothing to do with this bug. The E'' prefix is the extended escape sequence that Postgresql-8.3 (and up) supports for non-ANSI SQL standard string escape sequences. This is postgresql's way of dealing with deficiencies/ambiguities in the ANSI SQL standard.
This issue is that the ActiveRecord value =~ /\\\d{3}/ "heuristic" that decides on how unescape BYTEA data is completely WRONG and UNNECESSARY and slow.
Just let the low-level Postgres driver do its job of handling BYTEA data -- there is no reason for ActiveRecord to do any thing
other than passing the raw unescaped data to the correct Postgres unescape_bytea() function.If the low-level postgres lib (pb, postgres, or whatever) is broken, don't fix it in ActiveRecord.
-
...Paul March 10th, 2010 @ 12:21 AM
Kurt, I think you're right. I'll file a new bug about the problem with standard_conforming_strings and Rails, and copy some of the info from this thread to that one.
-
Jeremy Kemper March 10th, 2010 @ 12:58 AM
This ticket is good Paul, keeps history. Sounds like the answer is to simply remove the stale heuristic.
-
Jeremy Kemper April 25th, 2010 @ 12:20 AM
- State changed from open to resolved
This has been sitting stale forever, so I made some simple changes:
http://github.com/rails/rails/commit/c9e15709aef6baed329157adeef580...
http://github.com/rails/rails/commit/5c0ad822363f15d372caaee8b39f3c...Reopen with a failing test case if you have any issues.
Create your profile
Help contribute to this project by taking a few moments to create your personal profile. Create your profile »
<h2 style="font-size: 14px">Tickets have moved to Github</h2>
The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>
People watching this ticket
Referenced by
- 2410 Postgres 8.3.6: base_test: test_quote fails due to not quoting escape characters Ticket 2412 has been marked duplicate, while the original...
- 2412 [Patch] Postgres support for standard_conforming_strings = on Ah, this bug was already filed, great. I sparked some com...