This project is archived and is in readonly mode.

#1063 ✓resolved
Adam Majer

[PATCH] binary data corruption in PostgreSQL adapter

Reported by Adam Majer | September 17th, 2008 @ 04:53 PM

To create the problem, you only need to do,


  rails my_app; cd my_app
  script/generate model BorkedBinary b_data:binary

... fix config/database.yml so rails can connect ...


  rake db:migrate
  script/console
     BorkedBinary.create( :b_data => '\404' )

if it works correctly, you'll get '\404'

unfortunately, where b = BorkedData object,


>> b.b_data = '\404'
=> "\\404"
>> b.b_data
=> "404"

The problem is before the postgresql driver, as the query generated is broken,


BorkedBinary Create (0.001279)   INSERT INTO "borked_binaries" ("created_at", "updated_at", "b_data") VALUES('2008-09-17 07:49:54.316491', '2008-09-17 07:49:54.316491', E'404')

I've tried this with MySQL, PostgreSQL and SQLite3 adapters. Only PostgreSQL is affected.

Comments and changes to this ticket

  • Adam Majer

    Adam Majer September 17th, 2008 @ 05:14 PM

    In addition, only binary data with \ followed by 3 numbers seems to be affected.

  • Adam Majer

    Adam Majer September 17th, 2008 @ 09:07 PM

    Traced it back to the

    
       self.binary_to_string(value)
    

    in postgreSQL adapter. Basically, rails assumes that stored values are escaped by database and need unescaping!

    
    self.binary_to_string( value )
    return value
    end
    

    PGconn.unescape_bytea is ONLY to to be used to unescape database quoted stuff. It is NOT there to unescape PGconn.escape_byea !!

    
    irb(main):002:0> require 'pg'
    => true
    irb(main):003:0> PGconn.escape_bytea( '\404' )
    => "\\\\\\\\404"
    irb(main):004:0> '\404' == PGconn.unescape_bytea(PGconn.escape_bytea( '\404' ))
    => false
    irb(main):005:0> PGconn.unescape_bytea(PGconn.escape_bytea( '\404' ))
    => "\\\\404"
    
  • Adam Majer

    Adam Majer September 17th, 2008 @ 09:09 PM

    Just an aside, ignoring unescaping bit makes my example of '\404' work but of course breaks on actual data fetch from database...

  • Adam Majer

    Adam Majer September 17th, 2008 @ 09:26 PM

    • Tag changed from activerecord, bug, postgresql to 2.0-stable, 2.1, activerecord, bug, edge, postgresql

    Checked 2.0.2, 2-1-stable branch and latest HEAD. As expected, same problem.

  • Adam Majer

    Adam Majer September 18th, 2008 @ 11:32 PM

    • Title changed from “binary data broken with PostgreSQL adapter” to “binary data corruption in PostgreSQL adapter”
  • Adam Majer

    Adam Majer September 19th, 2008 @ 10:56 PM

    Added patch,

    http://git.debian.org/?p=collab-.../rails.git;a=shortlog;h=refs/heads/for-upstream

    The commit you want is which is at top and vs. edge as of now, 93199ad2ea35d8736a0db154e819583a6e4cf983

    Please add unit tests, if you wish.

  • Adam Majer

    Adam Majer September 19th, 2008 @ 10:58 PM

    • Title changed from “binary data corruption in PostgreSQL adapter” to “[PATCH] binary data corruption in PostgreSQL adapter”

    Proper link

    Ok, trying again with the link

  • Adam Majer

    Adam Majer September 19th, 2008 @ 10:59 PM

    
    http://git.debian.org/?p=collab-maint/rails.git;a=shortlog;h=refs/heads/for-upstream
    

    Just copy/paste it.

  • Michael Koziarski

    Michael Koziarski October 23rd, 2008 @ 06:03 PM

    • Milestone cleared.
    • Assigned user set to “Michael Koziarski”

    Can you email the core list about this patch Adam, it'd be good to get a few more postgresql users reporting success or failure.

  • Kane

    Kane October 24th, 2008 @ 02:20 PM

    If i use binary fields with postgresql my models have no problems with it, but if i use the execute method the result doesnt get unescaped and i have to do it manually.

    
    result = User.connection.execute "SELECT test FROM users WHERE id = 1;"
    

    results in

    
    #<PGresult:0x432ce3c @result=[["\\240\\216\\323\\012E\\230\\216\\337\\243\\223\\352\\255\\331\\3722\\273"]], @fields=["test"],
    @res=#<PostgresPR::Connection::Result:0x432ce28 @rows=[["\\240\\216\\323\\012E\\230\\216\\337\\243\\223\\352\\255\\331\\3722\\273"]], @fields=[#<struct PostgresPR::RowD
    escription::FieldInfo name="test", oid=29198, attr_nr=10, type_oid=17, typlen=-1, atttypmod=-1, formatcode=0>], @cmd_tag="SELECT">>
    

    and then i have to do this do get the correct data

    
    value = ActiveRecord::ConnectionAdapters::PostgreSQLColumn.send(:binary_to_string, result.result.first.first)
    

    does this patch also correct this problem by moving the binary escape/unescape from column to the driver or is this another issue? Maybe it isnt an issue at all and my useage is wrong, i dont have much knowlage of AR-internals.

  • Adam Majer

    Adam Majer October 24th, 2008 @ 03:45 PM

    Yes, it moves the escape/escape from column to driver. All binary data is unescaped when it is fetched from PostgreSQL driver and stored as binary inside ActiveRecord.

    You may not necessarily see problems in binary data corruption as it only happens when binary data contains \123 type of ASCII string.

    • Adam
  • Michael Koziarski

    Michael Koziarski October 24th, 2008 @ 08:42 PM

    Perhaps the best thing to do is describe a pattern we could use to see it affecting a running application?

    A simple test app or something? Some test files?

    As someone who never uses blobs, and hasn't touched postgres in a while, I'm really reluctant to pull the trigger on this without some examples...

  • Adam Majer

    Adam Majer October 24th, 2008 @ 10:25 PM

    BTW, regarding Kane's post, your usage is incorrect because,

    1. you are executing the query in the pg driver
    2. the result is PGresult - has nothing to do with AR
    3. you have to unescape the data because that's how PostgreSQL returns and AR should not meddle and change PGresult class. It would break existing application.

    You should not be calling AR to unescape the string. The correct usage is to use pg driver directly to unescape (after you apply my patch).

    res = User.connection.execute( 'SELECT column FROM table ....' )
    User.connection.unescape_bytea( res[0]['column'] )
    
    

    You know if you need to do this if

    # BYTEA type and format is escaped
    res.ftype(0) == 17 and res.fformat(0)==0
    
    

    Of course, this is very anti-AR. To be correct in spirit of AR, the connector should return result as an array, not as a PGresult. But that is completely different problem.

    • Adam

    PS. To work correctly with other adapters, and assuming they return something different than escaped blobs,

    res = User.connection.execute( 'SELECT column FROM table ....' ) data = res[0]['column'] if res.is_a? PGresult and res.ftype(0) == 17 and res.fformat(0)==0

     data = User.connection.unescape_bytea( data )
    
    

    end

  • Adam Majer

    Adam Majer October 24th, 2008 @ 10:30 PM

    Err, see my first post to the report? That is the test case right there with a step-by-step guide how to reproduce it.

    People generally do not do,

    model.blob_column = file model.md5_check = md5_sum( file ) model.save

    then when they fetch it, they do not check if model.blob_column is actually correct. They just dump it (mostly as images, I think) and if it works, it works. It fails, they assume something is wrong with initial images.

    I actually store the md5_sum for blobs that are later dumped to file system with a cronjob. I do this as I hate giving out write access to the filesystem for webserver user!

    • Adam

    PS. Real-world examples for files that have lots of \123 (or "\\123" as a string) patters will be all Microsoft Office files and related, like .MSI (Windows Installer)

  • Repository

    Repository October 25th, 2008 @ 11:57 AM

    • State changed from “new” to “committed”

    (from [a3dd69da6d76fd8375e8d4da39ed254f8fcc3238]) Fix binary data corruption bug in PostgreSQL adaptor

    1. Move the binary escape/unescape from column to the driver - we should store binary data AR just like most other adaptors
    2. check to make sure we only unescape bytea data PGresult.ftype( column ) == 17 that is passed to us in escaped format PGresult.fformat( column ) == 0

    Signed-off-by: Michael Koziarski michael@koziarski.com [#1063 state:committed] http://github.com/rails/rails/co...

  • Repository

    Repository October 25th, 2008 @ 11:57 AM

    (from [932dffc559ef188eb31d0223116e9da361833488]) Fix binary data corruption bug in PostgreSQL adaptor

    1. Move the binary escape/unescape from column to the driver - we should store binary data AR just like most other adaptors
    2. check to make sure we only unescape bytea data PGresult.ftype( column ) == 17 that is passed to us in escaped format PGresult.fformat( column ) == 0

    Signed-off-by: Michael Koziarski michael@koziarski.com [#1063 state:committed] http://github.com/rails/rails/co...

  • John Wilger

    John Wilger October 25th, 2008 @ 07:57 PM

    Having an issue which I /think/ is related to this patch. Using PostgreSQL 8.3.3 and the postgres gem version 0.7.9.2008.01.28 and the Rails v2.2.0 tag (932dffc559ef188eb31d0223116e9da361833488).

    Running either rake db:create or (after creating the database by hand) rake db:migrate, I get the following exception:

    
    jwilger Vidar:~/projects/johnwilger.com [1061]% rake db:migrate --trace                                                                              (master)
    (in /Users/jwilger/projects/johnwilger.com)
    ** Invoke db:migrate (first_time)
    ** Invoke environment (first_time)
    ** Execute environment
    ** Execute db:migrate
    rake aborted!
    NoMethodError: undefined method `fformat' for #<PGresult:0x22ee42c>: SHOW client_min_messages
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb:189:in `log'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:490:in `query'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:711:in `client_min_messages'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:257:in `supports_standard_conforming_strings?'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:942:in `connect'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:210:in `initialize'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:37:in `new'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:37:in `postgresql_connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:215:in `send'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:215:in `new_connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:237:in `checkout_new_connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:180:in `checkout'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:176:in `loop'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:176:in `checkout'
    /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:175:in `checkout'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:90:in `connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:321:in `retrieve_connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb:121:in `retrieve_connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb:113:in `connection'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/migration.rb:429:in `initialize'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/migration.rb:394:in `new'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/migration.rb:394:in `up'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/activerecord/lib/active_record/migration.rb:377:in `migrate'
    /Users/jwilger/projects/johnwilger.com/vendor/rails/railties/lib/tasks/databases.rake:111
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:617:in `call'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:617:in `execute'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:612:in `each'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:612:in `execute'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:578:in `invoke_with_call_chain'
    /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:571:in `invoke_with_call_chain'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:564:in `invoke'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:2019:in `invoke_task'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1997:in `top_level'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1997:in `each'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1997:in `top_level'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:2036:in `standard_exception_handling'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1991:in `top_level'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1970:in `run'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:2036:in `standard_exception_handling'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/lib/rake.rb:1967:in `run'
    /usr/local/lib/ruby/gems/1.8/gems/rake-0.8.3/bin/rake:31
    /usr/local/bin/rake:19:in `load'
    /usr/local/bin/rake:19
    
  • John Wilger

    John Wilger October 25th, 2008 @ 08:00 PM

    Also, just checked by reverting to the previous commit of Rails, and I no longer received this error.

  • Michael Koziarski

    Michael Koziarski October 26th, 2008 @ 10:15 AM

    • State changed from “committed” to “open”

    Adam, your changes need to work with post postgres and pg gems, PG works fine as that's what I have locally.

    John, do you know what the equivalent calls are for the postgres gem?

    •    unescape_col << ( res.fformat(j)==0 and res.ftype(j)==17 )
      
      
  • Tarmo Tänav

    Tarmo Tänav October 26th, 2008 @ 10:21 AM

    Looking through the postgres gem source there does not appear to be any references to PQfformat which is what fformat() wraps, ftype however is there and should work the same way.

  • Michael Koziarski

    Michael Koziarski October 26th, 2008 @ 06:08 PM

    OK, so if the behaviour can't be reproduced in the postgres gem, hopefully there's a reasonable alternative (perhaps only using ftype?).

    Failing that we can just return the result unmodified and log a warning. Then document in the release notes for good blob support, upgrade to the pg gem.

  • Adam Majer

    Adam Majer October 26th, 2008 @ 07:00 PM

    Only using ftype is good enough in current ActiveRecord implementation. PostgreSQL will always return escapped binary on queries unless it is told so otherwise. And the only way to tell it to send unescape binary data is explicitly through bound parameters - AR doesn't use them :(

    http://www.postgresql.org/docs/8...

    Basically, queries would have to go through PQexecParams or similar (see the description for the resultFormat parameter)

    So, removing reference to fformat in the patch will not currently affect it.

    Cheers, Adam

  • Michael Koziarski

    Michael Koziarski October 26th, 2008 @ 07:07 PM

    OK, I've removed those references. John, can you verify it works as you wanted now?

    Adam, if you're interested in helping to tidy up our query generation (to add support for proper bound parameters) we can start looking into that after 2.2 is branched.

    Basically it's a common refrain, but we've yet to get the critical mass required to dive into associations.rb

  • John Wilger

    John Wilger October 27th, 2008 @ 06:37 PM

    Commit 9e2bb2caff2b6fd4712ca3db258b68a588a69e9a seems to have fixed the issue I was having. Thanks!

  • Michael Koziarski

    Michael Koziarski October 27th, 2008 @ 06:39 PM

    • State changed from “open” to “resolved”

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Referenced by

Pages