This project is archived and is in readonly mode.

#3446 ✓wontfix
xpmatteo

tables_in_string matches literal string containing dot

Reported by xpmatteo | October 31st, 2009 @ 03:49 PM

When a query string contains dots, such as @@@"x = 'foo.bar'"@@@, the code that tries to understand which tables are mentioned in the query will mistake "foo" for a table name. The consequence is that sometimes, a query with :include will be executed with a join rather than with separate queries. And that can make the behaviour of ActiveRecord depend on user input. For example,

:conditions => ["x = ?", params[:input]]

In rare cases, this can cause a query to fail.

I provide a patch that tests for the problem and fixes it. Although I'm not comfortable with this solution; it seems to me the code in tables_in_string is too naive. This problem is similar to the one reported in ticket #532, which was fixed by improving the regexp.

Comments and changes to this ticket

  • xpmatteo

    xpmatteo October 31st, 2009 @ 03:51 PM

    • Assigned user changed from “Tarmo Tänav” to “Pratik”
  • xpmatteo

    xpmatteo November 5th, 2009 @ 01:29 PM

    Improved the regexp (and test) to filter out all string literals. (I think...)

  • Graeme Mathieson

    Graeme Mathieson June 15th, 2010 @ 03:05 PM

    I've just run across this issue. I created a test case which shows it off, using searchlogic, because I initially assumed the error was in there:

    http://gist.github.com/439071

    I'll give xpmatteo's patch a try, see if that fixes my test case.

  • Graeme Mathieson
  • Graeme Mathieson

    Graeme Mathieson June 15th, 2010 @ 03:21 PM

    Dodgy monkey patch for anyone else who stumbles across this:

    ActiveRecord::Associations::ClassMethods.module_eval do
      def tables_in_string(string)
        return [] if string.blank?
        string.gsub(/'(\\'|[^'])*'|"(\\"|[^"])*"/, '').scan(/([a-zA-Z_][\.\w]+).?\./).flatten
      end
    end
    
  • Simon Kaczor

    Simon Kaczor June 23rd, 2010 @ 05:16 PM

    The patch creates other issues. tables_in_string needs to return actual tables names:

      def tables_in_string(string)
        return [] if string.blank?
        string.gsub(/'(.*)'/) {|s| s.gsub('.', '_')}.scan(/([\.a-zA-Z_]+).?\./).flatten
      end
    
  • Andrew Selder

    Andrew Selder December 8th, 2010 @ 11:15 PM

    • Tag set to 2.3.10, 3.0.3, active_record, tables_in_string
    • Importance changed from “” to “”

    This is still an open issue in ActiveRecord 3.0.3 and 2.3.10

    bad = "((((((LOWER(events.name) LIKE '%ga.fs%')))) AND (events.delete_flag = 0)) AND (events.date >= curdate()))"

    In AR 3.0.3
    bad.scan(/([a-zA-Z_][.\w]+).?./).flatten.map{ |s| s.downcase }.uniq => ["events", "ga"]

    In AR 2.3.10
    bad.scan(/([.a-zA-Z_]+).?./).flatten => ["events", "ga", "events", "events"]

  • Andrew Selder

    Andrew Selder December 8th, 2010 @ 11:36 PM

    Simon's regex above is not sufficient unfortunately.

    Two problems:

    1. The first gsub's pattern is greedy, so if you have multiple literals table names can get missed

    example
    bad2 = "((((((LOWER(events.name) LIKE '%ga.fs%')))) AND ((((((LOWER(venues.name) LIKE '%ga.fs%')))) AND (events.delete_flag = 0)) AND (events.date >= curdate()))"

    bad2.gsub(/'(.*)'/) {|s| s.gsub('.', '')}.scan(/([.a-zA-Z]+).?./).flatten => ["events", "events", "events"]

    Here it lost the venues table.

    1. This only works for literals enclosed by single quotes. Literals could also be enclosed by double quotes.

    Here is my proposed updated regex:

    string.gsub(/(['"])(.*?)(\1)/) {|s| s.gsub('.', '')}.scan(/([.a-zA-Z]+).?./).flatten

    I'll be ginning up a patch against 2.3.10 and 3.0.3 shortly

  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 12:50 AM

    Here are a couple of patches against master and 2-3-stable.

    It implements the regex I mentioned above.

  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 12:55 AM

    • Tag changed from 2.3.10, 3.0.3, active_record, tables_in_string to 2.3.10, 3.0.3, active_record, patch, tables_in_string

    Let's actually try attaching this time.

  • Ed Ruder
  • Mark Towfiq

    Mark Towfiq December 9th, 2010 @ 01:01 AM

    Looks good. I'm glad one of the tests included the tablename.columnname case as it looks like the current regexp cares about that.

  • nate b

    nate b December 9th, 2010 @ 01:02 AM

    Good catch, Andrew. Worked perfectly for me.

  • Ed Ruder

    Ed Ruder December 9th, 2010 @ 01:03 AM

    +1 (Adding comment, this time.) My project got bit by this problem--I think this patch addresses the problem better than the previous patch.

  • Ed Ruder

    Ed Ruder December 9th, 2010 @ 01:09 AM

    • Tag changed from 2.3.10, 3.0.3, active_record, patch, tables_in_string to 2.3.10, 3.0.3, active_record, patch, tables_in_string, verified
  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 01:15 AM

    • Tag changed from 2.3.10, 3.0.3, active_record, patch, tables_in_string, verified to 2.3.10, 3.0.3, active_record, patch, tables_in_string

    readding patches with the ticket update keyword stuff in commit comment

  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 01:15 AM

    • Tag changed from 2.3.10, 3.0.3, active_record, patch, tables_in_string to 2.3.10, 3.0.3, active_record, patch, tables_in_string, verified
  • Andrew Selder
  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 02:42 AM

    • Tag changed from 2.3.10, 3.0.3, active_record, patch, tables_in_string, verified to 2.3.10, 3.0.3, active_record, tables_in_string

    Jeremy Evans found a case where the regex doesn't work (embedded escaped quote in literal).

    I'll have the regex fixed and new patches up shortly.

  • Andrew Selder

    Andrew Selder December 9th, 2010 @ 04:25 AM

    After talking to some of the guys over on the core mailing list, this tables_in string was a temporary hack in the move to preloading. It will probably be coming out altogether in

    In Rails 3, use .preload(:bars) rather than include.

    In Rails 2, we'll just have to stick with the monkey patch method.

    Here's what I've come up with that seems to work with all the stuff people have mentioned.

    ActiveRecord::Associations::ClassMethods.module_eval do
      def tables_in_string(string)
        return [] if string.blank?
        literals_without_dots(string).scan(/([\.a-zA-Z_]+).?\./).flatten
      end
    
      def literals_without_dots(string) 
        string.gsub(/(['"])(([^\\]|(\\.))*?)(\1)/) {|s| s.gsub('.', '_')} 
      end
    end
    
  • Michael Koziarski

    Michael Koziarski December 9th, 2010 @ 07:36 PM

    As Andrew says, I'm loathe to climb further down the rabbit hole of trying to parse SQL with regexps. It's destined to fail.

    If you're bitten by this in 2-3-stable apps you can either monkey patch in an initializer, or use will's patch in #3683

  • Michael Koziarski

    Michael Koziarski December 9th, 2010 @ 08:04 PM

    • State changed from “new” to “wontfix”

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

Referenced by

Pages