[PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
Reported by Willem van Bergen | December 30th, 2009 @ 03:28 PM
I wrote a SAX-based backend for XmlMini using Nokogiri. It is about ten times faster than the default Nokogiri backend and more memory-efficient. Results of the benchmark at http://github.com/stepheneb/rails_hash_from_xml :
Rehearsal -----------------------------------------------
rexml        24.050000   0.140000  24.190000 ( 24.386877)
libxml        0.750000   0.020000   0.770000 (  0.784579)
nokogiri     10.840000   0.090000  10.930000 ( 11.222716)
nokogirisax   1.060000   0.010000   1.070000 (  1.121834)
------------------------------------- total: 36.960000sec
                  user     system      total        real
rexml        24.110000   0.110000  24.220000 ( 24.597668)
libxml        0.460000   0.010000   0.470000 (  0.456876)
nokogiri     10.480000   0.030000  10.510000 ( 10.547534)
nokogirisax   0.940000   0.000000   0.940000 (  0.933274)
It is not as fast as the LibXML backend, but that backend is not fully compatible and is based on abandonware ;-)
Moreover, the SAX document class that builds the hash can be swapped for another one, which is useful if you want to "fix" faulty XML that is sent to your application. That is why I wrote this backend in the first place.
I wrote the patch against the 2-3-stable branch, but it applies cleanly to the master branch. All tests run OK when I switch the default REXML backend for this implementation.
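To illustrate the idea of a swappable event handler, here is a minimal sketch. It is not the code from the patch: it uses REXML's stream listener (bundled with Ruby) rather than Nokogiri's SAX API, and the names HashBuilder, FixingHashBuilder, parse_to_hash and the "adress" typo are all hypothetical. The "__content__" key is only meant to resemble the hash layout XmlMini produces.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Hypothetical handler: turns parser events into nested hashes
# without ever building a DOM tree.
class HashBuilder
  include REXML::StreamListener

  attr_reader :result

  def initialize
    @result = {}
    @stack = [@result]
  end

  def tag_start(name, attrs)
    node = attrs.dup
    parent = @stack.last
    # Repeated sibling elements are collected into an array.
    case parent[name]
    when nil   then parent[name] = node
    when Array then parent[name] << node
    else            parent[name] = [parent[name], node]
    end
    @stack.push(node)
  end

  def text(string)
    return if string.strip.empty?
    (@stack.last["__content__"] ||= +"") << string
  end

  def tag_end(_name)
    @stack.pop
  end
end

# Hypothetical "fixing" handler: renames a misspelled element
# before it is stored in the hash.
class FixingHashBuilder < HashBuilder
  RENAMES = { "adress" => "address" }.freeze

  def tag_start(name, attrs)
    super(RENAMES.fetch(name, name), attrs)
  end
end

# The handler class is a parameter, so it can be switched per call.
def parse_to_hash(xml, handler_class = HashBuilder)
  handler = handler_class.new
  REXML::Document.parse_stream(xml, handler)
  handler.result
end

xml = '<person id="1"><name>Ann</name><adress>Main St</adress></person>'
plain = parse_to_hash(xml)
fixed = parse_to_hash(xml, FixingHashBuilder)
```

Here plain keeps the faulty "adress" key, while fixed stores the same content under "address". The parser and the input are identical in both calls; only the handler differs.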
Comments and changes to this ticket
-
Bart ten Brinke December 30th, 2009 @ 04:23 PM
Wow, a factor-of-10 speed increase is really insane.
Patch applies and suite runs perfectly.
+1
-
Bart ten Brinke December 30th, 2009 @ 04:27 PM
- Assigned user set to Jeremy Kemper
-
Willem van Bergen January 1st, 2010 @ 11:39 AM
I have also rewritten the current LibXML and Nokogiri backends to fix bugs and improve speed, for both Rails 3 and Rails 2.3. See the ticket for patches at https://rails.lighthouseapp.com/projects/8994/tickets/3641
The new tests I wrote for that ticket show that this SAX-based backend still has issues (just like the default Nokogiri and LibXML backends) and is not compatible with Rails 3. I will soon upload a new SAX-based patch here that resolves these issues.
-
Willem van Bergen January 1st, 2010 @ 12:48 PM
I attached a file which implements two SAX-based parsers for XmlMini, using both Nokogiri and LibXML. Currently, this patch is only for the master branch. I will attach a backported version later today for Rails 2.3.
Performance comparison
The following results are for the Rails 3 branch. The REXML, LibXML and Nokogiri results are for the unchanged versions. The LibXML++ and Nokogiri++ results are for the patched versions from my other ticket (https://rails.lighthouseapp.com/projects/8994/tickets/3641).
                  user     system      total        real
REXML        17.170000   0.060000  17.230000 ( 17.297263)
LibXML        2.100000   0.100000   2.200000 (  2.217380)
LibXML++      0.530000   0.000000   0.530000 (  0.531034)
LibXMLSAX     0.630000   0.010000   0.640000 (  0.632472)
Nokogiri      5.280000   0.020000   5.300000 (  5.322575)
Nokogiri++    1.840000   0.020000   1.860000 (  1.872055)
NokogiriSAX   0.770000   0.000000   0.770000 (  0.778777)
It appears that the improved LibXML++ implementation is the fastest, but if you want to stick with Nokogiri, the SAX-based parser comes close.
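For readers who want the relative speeds rather than raw seconds, the following sketch computes them from the "real" column of the Rails 3 results above. The constant REAL_SECONDS and the helper speedup are names made up for this illustration; the numbers are copied from the results and nothing is re-measured.

```ruby
# "real" (wall-clock) seconds copied from the Rails 3 results above.
REAL_SECONDS = {
  "REXML"       => 17.297263,
  "LibXML"      => 2.217380,
  "LibXML++"    => 0.531034,
  "LibXMLSAX"   => 0.632472,
  "Nokogiri"    => 5.322575,
  "Nokogiri++"  => 1.872055,
  "NokogiriSAX" => 0.778777
}.freeze

# How many times faster `candidate` is than `baseline`.
def speedup(baseline, candidate, timings = REAL_SECONDS)
  timings.fetch(baseline) / timings.fetch(candidate)
end

REAL_SECONDS.each_key do |name|
  puts format("%-12s %6.1fx faster than REXML", name, speedup("REXML", name))
end
```

For example, speedup("Nokogiri", "NokogiriSAX") compares the SAX-based parser with the unchanged Nokogiri backend, and speedup("LibXMLSAX", "LibXML++") shows how small the gap between the two fastest LibXML variants is.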
-
Willem van Bergen January 1st, 2010 @ 01:04 PM
- Tag changed from activesupport, backend, nokogiri, patch, sax, xmlmini to activesupport, backend, libxml, nokogiri, patch, sax, xmlmini
- Title changed from SAX/Nokogiri backend for XmlMini to [PATCH] SAX-based backend for XmlMini, using Nokogiri and/or LibXML
-
Repository January 1st, 2010 @ 09:19 PM
(from [37c51594b9610469173f3deee1ffdda4beb3e397]) Added two SAX-based backends for XmlMini, using both LibXML and Nokogiri.
[#3636]
Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
http://github.com/rails/rails/commit/37c51594b9610469173f3deee1ffdd...
-
Repository January 1st, 2010 @ 09:19 PM
- State changed from new to committed
(from [689984ddd3a482b5c0986fdf1889323f096050fa]) Fixed some bugs and fixed some tests in new SAX-based XmlMini backends.
[#3636 state:committed]
Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
http://github.com/rails/rails/commit/689984ddd3a482b5c0986fdf188932...
Referenced by
- 3641 [PATCH] XmlMini - Fixed bugs and improved speed of LibXML and Nokogiri backend Note that I also wrote a SAX-based backend using Nokogiri...