Pickling a class definition
Asked Answered
G

3

13

Is there a way to pickle a class definition?

What I'd like to do is pickle the definition (which may created dynamically), and then send it over a TCP connection so that an instance can be created on the other end.

I understand that there may be dependencies, like modules and global variables that the class relies on. I'd like to bundle these in the pickling process as well, but I'm not concerned about automatically detecting the dependencies because it's okay if the onus is on the user to specify them.

Gribble answered 13/4, 2010 at 2:39 Comment(1)
I found this which pickles the whole interpreter state: dev.pocoo.org/hg/sandbox/file/tip/pshell.py Class definitions seem to be pickled as well...Gribble
H
8

If you use dill, it enables you to treat __main__ as if it were a python module (for the most part). Hence, you can serialize interactively defined classes, and the like. dill also (by default) can transport the class definition as part of the pickle.

>>> class MyTest(object):
...   def foo(self, x):
...     return self.x * x
...   x = 4
... 
>>> f = MyTest() 
>>> import dill
>>>
>>> with open('test.pkl', 'wb') as s:
...   dill.dump(f, s)
... 
>>> 

Then shut down the interpreter, and send the file test.pkl over TCP. On your remote machine, now you can get the class instance.

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> with open('test.pkl', 'rb') as s:
...   f = dill.load(s)
... 
>>> f
<__main__.MyTest object at 0x1069348d0>
>>> f.x
4
>>> f.foo(2)
8
>>>             

But how to get the class definition? So this is not exactly what you wanted. The following is, however.

>>> class MyTest2(object):
...   def bar(self, x):
...     return x*x + self.x
...   x = 1
... 
>>> import dill
>>> with open('test2.pkl', 'wb') as s:
...   dill.dump(MyTest2, s)
... 
>>>

Then after sending the file… you can get the class definition.

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> with open('test2.pkl', 'rb') as s:
...   MyTest2 = dill.load(s)
... 
>>> print dill.source.getsource(MyTest2)
class MyTest2(object):
  def bar(self, x):
    return x*x + self.x
  x = 1

>>> f = MyTest2()
>>> f.x
1
>>> f.bar(4)
17

So, within dill, there's dill.source, and that has methods that can detect dependencies of functions and classes, and take them along with the pickle (for the most part).

>>> def foo(x):
...   return x*x
... 
>>> class Bar(object):
...   def zap(self, x):
...     return foo(x) * self.x
...   x = 3
... 
>>> print dill.source.importable(Bar.zap, source=True)
def foo(x):
  return x*x
def zap(self, x):
  return foo(x) * self.x

So that's not "perfect" (or maybe not what's expected)… but it does serialize the code for a dynamically built method and it's dependencies. You just don't get the rest of the class -- but the rest of the class is not needed in this case.

If you wanted to get everything, you could just pickle the entire session.

>>> import dill
>>> def foo(x):
...   return x*x
... 
>>> class Blah(object):
...   def bar(self, x):
...     self.x = (lambda x:foo(x)+self.x)(x)
...   x = 2
... 
>>> b = Blah()
>>> b.x
2
>>> b.bar(3)
>>> b.x
11
>>> dill.dump_session('foo.pkl')
>>> 

Then on the remote machine...

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foo.pkl')
>>> b.x
11
>>> b.bar(2)
>>> b.x
15
>>> foo(3)
9

Lastly, if you want the transport to be "done" for you transparently, you could use pathos.pp or ppft, which provide the ability to ship objects to a second python server (on a remote machine) or python process. They use dill under the hood, and just pass the code across the wire.

>>> class More(object):
...   def squared(self, x):
...     return x*x
... 
>>> import pathos
>>> 
>>> p = pathos.pp.ParallelPythonPool(servers=('localhost,1234',))
>>> 
>>> m = More()
>>> p.map(m.squared, range(5))
[0, 1, 4, 9, 16]

The servers argument is optional, and here is just connecting to the local machine on port 1234… but if you use the remote machine name and port instead (or as well), you'll fire off to the remote machine -- "effortlessly".

Get dill, pathos, and ppft here: https://github.com/uqfoundation

Holocrine answered 22/1, 2015 at 17:12 Comment(0)
W
6

Alas, not directly. You can send the string form of the class statement, or a bytecode form, and "rehydrate" it with an exec on the receiving end.

Wayne answered 13/4, 2010 at 2:44 Comment(2)
Are you referring to just sending the .py or .pyc files containing the class statement?Gribble
@Giorgio, you don't necessarily need to send all of the .py or .pyc file, just the source of the class itself (as identified by docs.python.org/library/inspect.html?#inspect.getsource) or its compiled form.Wayne
F
4

The documentation does a pretty good job of explaining what can and can't be pickled, and why that is.

http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled

Basically, if the class or module is importable by name when it gets unpickled, it should work, unless you plan on changing your class definition between now and when you unpickle it. In the class definition below, only the class name "Test", and the method name "mymethod" will be pickled. If you pickle out the class definition, then change the definition so that attr is a different value, and mymethod does something completely different, the pickle will pick up the new definition.

class Test(object):
    attr = 5

    def mymethod(self, arg):
        return arg
Fibered answered 13/4, 2010 at 7:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.