
#1069 ✓duplicate
Matt Hanlon

HTML::Node.parse has problems with CDATA

Reported by Matt Hanlon | September 18th, 2008 @ 06:21 PM | in 2.x

actionpack's HTML::Node.parse mishandles some CDATA sections. It raises a NoMethodError on an unclosed CDATA section even when strict is false, it can't handle a CDATA section whose content contains the string "<![CDATA[", and to_s drops a ']' from the closing "]]>". For example:

# Can parse unclosed HTML tag with strict=false...
>> HTML::Node.parse(nil,0,0,'<foo', false)
=> #<HTML::Tag:0xb6510c28 @children=[], @closing=nil, @parent=nil, @position=0, @line=0, @attributes={}, @name="foo">

# ... but fails to parse unclosed CDATA tag
>> HTML::Node.parse(nil,0,0,'<![CDATA[', false)
NoMethodError: You have a nil object when you didn't expect it!
The error occurred while evaluating nil.gsub
        from /usr/local/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/vendor/html-scanner/html/node.rb:154:in `parse'
        from (irb):37

# Can't handle the string "<![CDATA[" inside a CDATA tag
>> HTML::Node.parse(nil,0,0,'<![CDATA[<![CDATA[]]>').content
=> ""

# to_s is missing a ']' in the ]]> close tag
>> HTML::Node.parse(nil,0,0,'<![CDATA[foo bar]]>').to_s
=> "<![CDATA[foo bar]>"

Patch for these issues is attached.
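
For illustration only (this is not the attached patch and not actionpack code), here is a minimal standalone sketch of the behavior the examples above ask for. The helper names `parse_cdata` and `cdata_to_s` are hypothetical, and the sketch assumes the input starts with the CDATA open marker and that content ends at the first "]]>".

```ruby
require 'strscan'

# Hypothetical helper: returns the content of a CDATA section that starts
# at the beginning of +source+, or nil if +source+ is not a CDATA section.
def parse_cdata(source, strict = true)
  scanner = StringScanner.new(source)
  return nil unless scanner.scan(/<!\[CDATA\[/)

  # Content runs up to the first "]]>", so a nested "<![CDATA[" is plain text.
  if scanner.scan(/(.*?)\]\]>/m)
    scanner[1]
  elsif strict
    raise ArgumentError, "expected ]]> (got #{scanner.rest.inspect})"
  else
    # Non-strict mode: treat the rest of the input as the content.
    scanner.rest
  end
end

# Hypothetical helper: serializes content with the full "]]>" close marker.
def cdata_to_s(content)
  "<![CDATA[#{content}]]>"
end

parse_cdata('<![CDATA[', false)          # => ""
parse_cdata('<![CDATA[<![CDATA[]]>')     # => "<![CDATA["
cdata_to_s('foo bar')                    # => "<![CDATA[foo bar]]>"
```

Raising only in strict mode mirrors how the unclosed `<foo` tag is tolerated with strict=false in the first example.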
