Pickle: dealing with updated class definitions

3

23

After a class definition is updated by recompiling a script, pickle refuses to serialize previously instantiated objects of that class, giving an error of the form: "Can't pickle <object>: it's not the same object as <module>.<name>".

Is there a way to tell pickle that it should ignore such cases? To just identify classes by name, ignore whichever internal unique ID is causing the mismatch?

I would definitely welcome as an answer the suggestion of an alternative, equivalent module which solves this problem in a convenient and robust manner.


For reference, here's my motivation:

I am creating a high productivity, rapid iteration development environment in which Python scripts are edited live. Scripts are repeatedly recompiled, but data persists across compiles. As part of the productivity goals, I am trying to use pickle for serialization, to avoid the cost of writing and updating explicit serialization code for constantly changing data structures.

Mostly I serialize built-in types. I am careful to avoid meaningful changes in the classes which I pickle, and when necessary I use the copy_reg.pickle mechanism to perform upconversion on unpickle.

Script recompilation prevents me from pickling objects at all, even if class definitions have not actually changed (or have only changed in a benign way).

Octosyllable answered 28/4, 2013 at 23:18 Comment(1)
I've not spent much time with this, but this may be useful: docs.python.org/2/library/… – Acanthoid
14

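Pickle stores an instance by reference to its class (module and name), not by copying the class. Unless you can recover the earlier version of the class definition, the reference pickle needs to dump and load the instance is gone. So, strictly speaking, this is "not possible".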

However, if you did want to do it, you could save previous versions of your class definitions. You would then have to trick pickle into referring to your old, saved class definitions rather than the most current ones, which might just amount to editing obj.__class__ or obj.__module__ to point to your old class. There may also be other things in your class instance that refer to the old class definition and that you'd have to handle. Also, if you add or delete a class method, you may run into unexpected results, or have to update the instance accordingly. Another interesting twist is that you could make the unpickler always use the most current version of your class.
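On that last point: the standard unpickler resolves a class by module and name at load time through Unpickler.find_class, so overriding that method is one place to control which definition is used. Below is a minimal sketch, not something taken from this answer. It assumes a synthetic module called live_script standing in for the recompiled script, and a hypothetical RENAMES table for classes that changed name between compiles.

```python
import io
import pickle
import sys
import types

MODULE_NAME = "live_script"  # hypothetical name for the live-edited script


def recompile(source):
    """Simulate a recompile: run source in a fresh module and register it."""
    module = types.ModuleType(MODULE_NAME)
    exec(source, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


# Hypothetical rename table: (module, old name) -> (module, new name).
RENAMES = {("live_script", "Point"): ("live_script", "Point2D")}


class CurrentClassUnpickler(pickle.Unpickler):
    """Resolve classes against whatever is defined at load time."""

    def find_class(self, module, name):
        # Redirect renamed classes, then fall back to the normal lookup.
        module, name = RENAMES.get((module, name), (module, name))
        return super().find_class(module, name)


def loads_current(data):
    return CurrentClassUnpickler(io.BytesIO(data)).load()


V1 = """
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
"""

V2 = """
class Point2D:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm1(self):
        return abs(self.x) + abs(self.y)
"""

data = pickle.dumps(recompile(V1).Point(1, -2))  # pickled under the old definition
current = recompile(V2)                          # class renamed, method added
point = loads_current(data)                      # rebuilt with the current class
```

This only covers loading. The instance state is restored as-is, so any attribute the new class expects but the old one never set still has to be handled separately.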

My serialization package, dill, has some methods that can dump compiled source from a live code object to a temporary file, and then serialize using that temporary file. It's one of the newer parts of the package, so it's not as robust as the rest of dill. Also, your use case is not a use case I'd considered, but I could see how it would be a nice feature to have.

Beaded answered 14/10, 2013 at 14:42 Comment(3)
Ok, I have added this feature to dill in the latest revision on GitHub. Implemented with far less trickery than I thought... just serialize the class definition with the pickle, and voila. – Beaded
Weird, I want to do the opposite. I used dill to save an object, but when I call the new method it does not return the string I expect. So it's likely using the old definition? How do I force load the old definition after I have loaded the old object using dill? – Bendy
@CharlieParker: Hard to tell what you are asking without an example. Maybe you could submit a new question on SO, or fill out an issue on GitHub? – Beaded
4

There is a simple way to do it that is basically User's answer.

First I will give the failing code:

# Tested with Python 3.6.7
import pickle

class Foo:  # original definition
    pass

foo = Foo()

class Foo:  # redefinition, as happens when the script is recompiled
    def bar(self):
        return 0

pickle.dumps(foo)  # raises PicklingError: Can't pickle <class '__main__.Foo'>: it's not the same object as __main__.Foo

To fix this problem, just reset the __class__ attribute of foo before pickling as in User's answer:

import pickle

class Foo:  # original definition
    pass

foo = Foo()

class Foo:  # redefinition
    def bar(self):
        return 0

foo.__class__ = globals()[type(foo).__name__]  # reset __class__ to the current Foo, looked up by name
pickle.dumps(foo)  # works fine

This solution only works if you truly want pickle to ignore any differences between the two versions of the class. If the two versions have significant differences, I don't expect this solution to work.
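The same reset can be wrapped in a small helper that looks the class up through sys.modules, so it also works for classes that live outside the current namespace. This is a rough sketch under assumptions: refresh_class is a hypothetical name, and the live_script module is synthetic, built here only to simulate a recompile.

```python
import pickle
import sys
import types

MODULE_NAME = "live_script"  # hypothetical name for the live-edited script


def recompile(source):
    """Simulate a recompile: run source in a fresh module and register it."""
    module = types.ModuleType(MODULE_NAME)
    exec(source, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


def refresh_class(obj):
    """Point obj at the class currently bound to its module and qualified name."""
    old = type(obj)
    current = sys.modules[old.__module__]
    for part in old.__qualname__.split("."):
        current = getattr(current, part)
    if current is not old:
        obj.__class__ = current  # raises TypeError if the layouts are incompatible
    return obj


module = recompile("class Foo:\n    pass\n")
foo = module.Foo()
foo.value = 42

# Recompile with a changed definition; foo still refers to the old class.
module = recompile("class Foo:\n    def bar(self):\n        return 0\n")

try:
    pickle.dumps(foo)
    failed = False
except pickle.PicklingError:
    failed = True

restored = pickle.loads(pickle.dumps(refresh_class(foo)))
```

Note that refresh_class mutates the live object, so after the call foo itself is an instance of the new Foo.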

Ordway answered 9/6, 2019 at 0:37 Comment(0)
2

Two solutions come to mind:

  1. Before you pickle, set the object's __class__ to the class you want pickle to see:

    >>> class X(object):
    ...     pass
    ...
    >>> class Y(object):
    ...     pass
    ...
    >>> x = X()
    >>> x.__class__ = Y
    >>> type(x)
    <class '__main__.Y'>

    Maybe you can use persistent_id for this, because every object being pickled is passed to it.

  2. Define __reduce__ to do exactly what pickle does by default (have a look at pickle.py for this).
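The persistent_id idea from the first option can be sketched as a Pickler subclass. The hook sees every object before it is pickled, so it can repoint stale instances at the current class and then return None to let pickling proceed normally. This is an illustration under assumptions, not tested against the asker's setup: RefreshingPickler and current_class are hypothetical names, and live_script is a synthetic module used to simulate a recompile.

```python
import io
import pickle
import sys
import types

MODULE_NAME = "live_script"  # hypothetical name for the live-edited script


def recompile(source):
    """Simulate a recompile: run source in a fresh module and register it."""
    module = types.ModuleType(MODULE_NAME)
    exec(source, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


def current_class(cls):
    """Return the class now bound to cls's module and qualified name, or None."""
    target = sys.modules.get(cls.__module__)
    for part in cls.__qualname__.split("."):
        target = getattr(target, part, None)
        if target is None:
            return None
    return target if isinstance(target, type) else None


class RefreshingPickler(pickle.Pickler):
    """Repoint stale instances at the current class while pickling."""

    def persistent_id(self, obj):
        old = type(obj)
        new = current_class(old)
        if new is not None and new is not old:
            obj.__class__ = new  # raises TypeError if the layouts are incompatible
        return None  # None means: pickle this object the usual way


def dumps_refreshed(obj):
    buffer = io.BytesIO()
    RefreshingPickler(buffer).dump(obj)
    return buffer.getvalue()


V1 = """
class Node:
    def __init__(self, value):
        self.value = value
"""

V2 = """
class Node:
    def __init__(self, value):
        self.value = value

    def double(self):
        return 2 * self.value
"""

module = recompile(V1)
graph = {"items": [module.Node(1), module.Node(2)]}

module = recompile(V2)  # the Node instances in graph are now stale
restored = pickle.loads(dumps_refreshed(graph))
```

Unlike resetting __class__ by hand, this reaches instances nested inside containers. It still mutates the live objects as a side effect of pickling.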

Paction answered 29/4, 2013 at 14:53 Comment(1)
I successfully used the first method to convert pickle files for a class that I renamed. – Bega
