
[PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
Reported by Willem van Bergen | December 30th, 2009 @ 03:28 PM
I wrote a SAX-based backend for XmlMini using Nokogiri. It is way faster and more memory-efficient than the default Nokogiri backend. Results of the benchmark at http://github.com/stepheneb/rails_hash_from_xml :
Rehearsal -----------------------------------------------
rexml        24.050000   0.140000  24.190000 ( 24.386877)
libxml        0.750000   0.020000   0.770000 (  0.784579)
nokogiri     10.840000   0.090000  10.930000 ( 11.222716)
nokogirisax   1.060000   0.010000   1.070000 (  1.121834)
------------------------------------- total: 36.960000sec
                  user     system      total        real
rexml        24.110000   0.110000  24.220000 ( 24.597668)
libxml        0.460000   0.010000   0.470000 (  0.456876)
nokogiri     10.480000   0.030000  10.510000 ( 10.547534)
nokogirisax   0.940000   0.000000   0.940000 (  0.933274)
Not as fast as the LibXML backend, but that one is not fully compatible, and based on abandonware ;-)
Moreover, the SAX document that is being used to build the hash can be switched, which is nice if you want to "fix" some faulty XML that is being sent to your application. This is why I wrote this backend in the first place.
I wrote the patch against the 2-3-stable branch, but it applies cleanly to the master branch. All tests run OK when I switch the default REXML backend for this implementation.
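To illustrate the idea of a switchable SAX document, here is a minimal sketch, not the code from the patch. It assumes REXML's stream parser (which ships with Ruby) as a stand-in for Nokogiri's SAX parser, and the names `HashBuilder`, `DowncasingBuilder` and `hash_from_xml` are hypothetical. The `__content__` key imitates the way XmlMini stores element text. A subclass of the handler "fixes" faulty input, here by lower-casing tag names, before the hash is built.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical handler: turns SAX-style events into a nested Hash.
class HashBuilder
  include REXML::StreamListener

  attr_reader :result

  def initialize
    @result = {}
    @stack = [@result] # innermost open element is always last
  end

  def tag_start(name, attrs)
    node = attrs.to_h.dup # attributes become keys of the element's hash
    parent = @stack.last
    case parent[name]
    when nil   then parent[name] = node                 # first occurrence
    when Array then parent[name] << node                # third or later
    else            parent[name] = [parent[name], node] # second occurrence
    end
    @stack.push(node)
  end

  def text(content)
    return if content.strip.empty? # skip whitespace between elements
    node = @stack.last
    node['__content__'] = node['__content__'].to_s + content
  end

  def tag_end(_name)
    @stack.pop
  end
end

# Hypothetical "fixing" handler: normalises tag names before building.
class DowncasingBuilder < HashBuilder
  def tag_start(name, attrs)
    super(name.downcase, attrs)
  end
end

# The handler class is a parameter, so it can be swapped per call.
def hash_from_xml(xml, builder_class = HashBuilder)
  builder = builder_class.new
  REXML::Document.parse_stream(xml, builder)
  builder.result
end

xml = '<user id="1"><NAME>Willem</NAME><role>admin</role><role>dev</role></user>'
p hash_from_xml(xml)
p hash_from_xml(xml, DowncasingBuilder)
```

Because the hash is assembled event by event, no intermediate document tree is kept in memory, which is where the memory savings of a SAX approach come from.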
Comments and changes to this ticket
- 
            
         Bart ten Brinke December 30th, 2009 @ 04:23 PM
 Wow, a factor 10 speed increase is really insane.
 Patch applies and suite runs perfectly.
 +1
- 
            
         Bart ten Brinke December 30th, 2009 @ 04:27 PM
- Assigned user set to Jeremy Kemper
- 
            
         Willem van Bergen January 1st, 2010 @ 11:39 AM
 I have also rewritten the current LibXML and Nokogiri backends to fix bugs and improve speed, for both Rails 3 and Rails 2.3. See the ticket with patches at https://rails.lighthouseapp.com/projects/8994/tickets/3641
 The new tests I wrote for that ticket show that this SAX-based backend still has issues (just like the default Nokogiri and LibXML backends) and is not compatible with Rails 3. I will upload a new SAX-based patch here soon that resolves these issues.
- 
            
         Willem van Bergen January 1st, 2010 @ 12:48 PM
 I attached a file which implements two SAX-based parsers for XmlMini, using both Nokogiri and LibXML. Currently, this patch is only for the master branch. I will attach a backported version later today for Rails 2.3.
 Performance comparison
 The following results are for the Rails 3 branch. The REXML, LibXML and Nokogiri results are for the unchanged versions. The LibXML++ and Nokogiri++ results are for the patched versions from my other ticket (https://rails.lighthouseapp.com/projects/8994/tickets/3641).
                  user     system      total        real
REXML        17.170000   0.060000  17.230000 ( 17.297263)
LibXML        2.100000   0.100000   2.200000 (  2.217380)
LibXML++      0.530000   0.000000   0.530000 (  0.531034)
LibXMLSAX     0.630000   0.010000   0.640000 (  0.632472)
Nokogiri      5.280000   0.020000   5.300000 (  5.322575)
Nokogiri++    1.840000   0.020000   1.860000 (  1.872055)
NokogiriSAX   0.770000   0.000000   0.770000 (  0.778777)
 The improved LibXML++ implementation appears to be the fastest, but if you want to stick with Nokogiri, the SAX-based parser comes close.
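 Timings in this layout are what Ruby's `Benchmark` module prints. As a rough sketch of how such a comparison could be set up (not the benchmark script used for the numbers above), the following compares REXML's tree parser with its stream parser as stand-ins for the XmlMini backends. The sample document, the run count, and the names `CountingListener` and `compare_parsers` are assumptions made for illustration.

```ruby
require 'benchmark'
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that only counts elements; a real backend
# would build a hash here instead.
class CountingListener
  include REXML::StreamListener

  attr_reader :count

  def initialize
    @count = 0
  end

  def tag_start(_name, _attrs)
    @count += 1
  end
end

# Runs both parsers `runs` times each and returns the Benchmark::Tms
# results. bmbm performs a rehearsal pass before the measured pass.
def compare_parsers(xml, runs)
  Benchmark.bmbm do |bm|
    bm.report('tree')   { runs.times { REXML::Document.new(xml).root } }
    bm.report('stream') { runs.times { REXML::Document.parse_stream(xml, CountingListener.new) } }
  end
end

# Made-up input; substitute a representative payload for real measurements.
items = Array.new(200) { |i| "<item id=\"#{i}\">value #{i}</item>" }.join
results = compare_parsers("<items>#{items}</items>", 20)
puts results.map(&:real).inspect
```

 The absolute numbers depend heavily on the input document and the machine, so only the relative ordering of the backends is meaningful.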
- 
            
         Willem van Bergen January 1st, 2010 @ 01:04 PM
- Tag changed from activesupport, backend, nokogiri, patch, sax, xmlmini to activesupport, backend, libxml, nokogiri, patch, sax, xmlmini
- Title changed from SAX/Nokogiri backend for XmlMini to [PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
 
- 
         Repository January 1st, 2010 @ 09:19 PM
 (from [37c51594b9610469173f3deee1ffdda4beb3e397]) Added two SAX-based backends for XmlMini, using both LibXML and Nokogiri. [#3636]
 Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
 http://github.com/rails/rails/commit/37c51594b9610469173f3deee1ffdd...
- 
         Repository January 1st, 2010 @ 09:19 PM
- State changed from new to committed
 (from [689984ddd3a482b5c0986fdf1889323f096050fa]) Fixed some bugs and fixed some tests in new SAX-based XmlMini backends. [#3636 state:committed]
 Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
 http://github.com/rails/rails/commit/689984ddd3a482b5c0986fdf188932...
Referenced by
- 3641 [PATCH] XmlMini - Fixed bugs and improved speed of LibXML and Nokogiri backend
  Note that I also wrote a SAX-based backend using Nokogiri...
- 3636 [PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
  [#3636]
- 3636 [PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
  [#3636 state:committed]
People watching this ticket
- Jeremy Kemper
- Mike Dalessio
- Willem van Bergen