This project is archived and is in readonly mode.

#1191 ✓wontfix
Stephen Celis

Marshal serialized attributes

Reported by Stephen Celis | October 8th, 2008 @ 05:00 PM | in 2.x

This is a simple patch to use Marshal instead of YAML for attribute serialization. Marshaling is significantly faster (see below), and fixes some YAML load issues (including outstanding ticket #857).


require 'benchmark'
require 'yaml'
require 'rubygems'
require 'activesupport'

yaml = YAML.dump([])
marshal = Marshal.dump([])

Benchmark.bm(35) do |x|
  x.report('YAML: dump/load') {
    10_000.times { YAML.load(YAML.dump([])) }
  }
  x.report('Marshal: dump/load') {
    10_000.times { Marshal.load(Marshal.dump([])) }
  }
  x.report('YAML: load') {
    10_000.times { YAML.load(yaml) }
  }
  x.report('Marshal: load (with YAML detection)') {
    10_000.times { marshal.starts_with?('---') ? YAML.load(marshal) : Marshal.load(marshal) }
  }
end

#                                          user     system      total        real
# YAML: dump/load                      0.560000   0.020000   0.580000 (  1.836370)
# Marshal: dump/load                   0.040000   0.000000   0.040000 (  0.131774)
# YAML: load                           0.120000   0.020000   0.140000 (  0.447349)
# Marshal: load (with YAML detection)  0.060000   0.000000   0.060000 (  0.176883)

(Just to note: all tests still pass.)

Comments and changes to this ticket

  • Michael Koziarski

    Michael Koziarski October 8th, 2008 @ 05:23 PM

    This will break every application which uses serialized data.

    A patch which makes this an option could be applied, but not this.

  • Stephen Celis

    Stephen Celis October 8th, 2008 @ 05:28 PM

    No, it doesn't break applications because the attribute string is detected and unserialized as YAML or Marshal accordingly.

  • Michael Koziarski

    Michael Koziarski October 8th, 2008 @ 05:36 PM

    Well, it kinda depends on your definition of breaks, but silently changing what gets written to the database is a little surprising.

    
    serialize :foo, :with=>:marshall
    

    would achieve the same thing without the risk, so that seems to be the best of both worlds?

  • Stephen Celis

    Stephen Celis October 8th, 2008 @ 05:37 PM

    Actually, let me double-check the patch. (Nursing a head cold.) I'll update the ticket accordingly.

  • Stephen Celis

    Stephen Celis October 8th, 2008 @ 05:39 PM

    Well, it kinda depends on your definition of breaks, but silently changing what gets written to the database is a little surprising.

    I guess so, though I would favor Marshal as the default with perhaps a deprecation notice, given the performance improvement.

  • Stephen Celis

    Stephen Celis October 8th, 2008 @ 05:52 PM

    I've double-checked, and YAML-serialized attributes will still correctly deserialize.

    I'd be happy to make :marshal an option, but still think that it would be a preferable default. Does anyone else want to weigh in on this?

  • Michael Koziarski

    Michael Koziarski October 8th, 2008 @ 05:55 PM

    We could do it with an option. Defaulting to yaml, with a deprecation warning that in 2.3 it will default to :marshal.

    Then make the switch after we branch, that'll give people plenty of warning, and let you avoid the .starts_with?("---" stuff

  • Stephen Celis

    Stephen Celis October 8th, 2008 @ 06:02 PM

    Frederick Cheung makes some valid points on the list:

    While the option is fair enough, I don't thing all existing apps wouldn't want this turned on "silently": 1 if other people use your database yaml is ok as there are parsers for it in many languages whereas Marshal would be a PITA 2 if your existing column is not a blob column (which it wouldn't have to be previously since yaml generates plain text), the database will throw a hissy fit (or just truncate the data) when you try to insert a character that is not legal in the charset used. 3 should you be calling string_to_binary if the column supports it?

    I'll work on making it an option and attach an updated patch. I'm less sure it should be a future default, though, because of the first 2 points.

  • josh

    josh October 11th, 2008 @ 07:07 PM

    • Assigned user changed from “josh” to “Michael Koziarski”
  • Stephen Celis

    Stephen Celis October 11th, 2008 @ 07:18 PM

    Let's just close this one out for now.

  • Michael Koziarski

    Michael Koziarski October 12th, 2008 @ 05:32 PM

    • State changed from “new” to “wontfix”

    as requested

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile »

<h2 style="font-size: 14px">Tickets have moved to Github</h2>

The new ticket tracker is available at <a href="https://github.com/rails/rails/issues">https://github.com/rails/rails/issues</a>

People watching this ticket

Attachments

Pages