serializing and deserializing lambdas
V

3

17

I would like to serialize a Python lambda on machine A and deserialize it on machine B. There are a couple of obvious problems with that:

  • the pickle module does not serialize or deserialize code. It only serializes the names of classes/methods/functions
  • some of the answers I found with Google suggest using the low-level marshal module to serialize the func_code attribute of the lambda, but they fail to describe how one could reconstruct a function object from the deserialized code object
  • marshal.dumps(l.func_code) will not serialize the closure associated with the lambda. That raises the problem of detecting when a given lambda really needs a closure, and of warning the user that they are trying to serialize a lambda that uses one
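
To illustrate the first point, here is a minimal Python 3 sketch using only the standard library. The names `payload`, `square`, `stored_by_name` and `lambda_pickled` are illustrative and not from the question. It shows that pickle records where to find a function rather than its code, which is why a lambda is rejected:

```python
import pickle

# A named, importable function is pickled by reference: the payload holds
# the module and function names, not the bytecode.
payload = pickle.dumps(len)
stored_by_name = b"len" in payload

# A lambda has no importable name, so pickle cannot record a reference to it.
square = lambda x: x * x
try:
    pickle.dumps(square)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(stored_by_name, lambda_pickled)
```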

Hence, my question(s):

  • How would one reconstruct a function from the deserialized (demarshaled) code object?
  • How would one detect that a given lambda will not work properly without the associated closure?
Vena answered 9/8, 2012 at 7:2 Comment(0)
C
22

Surprisingly, checking whether a lambda will work without its associated closure is actually fairly easy. According to the data model documentation, you can just check the func_closure attribute:

>>> def get_lambdas():
...     bar = 42
...     return (lambda: 1, lambda: bar)
...
>>> no_vars, vars = get_lambdas()
>>> print no_vars.func_closure
None
>>> print vars.func_closure
(<cell at 0x1020d3d70: int object at 0x7fc150413708>,)
>>> print vars.func_closure[0].cell_contents
42
>>>
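
The session above is Python 2. On Python 3 the attributes are named `__closure__` and `__code__`, so the same check could be sketched as follows. The helper name `uses_closure` is hypothetical, made up for this example:

```python
def get_lambdas():
    bar = 42
    return (lambda: 1, lambda: bar)


def uses_closure(func):
    """Return True if func captures variables from an enclosing scope."""
    return func.__closure__ is not None


no_vars, with_vars = get_lambdas()

# Names of the captured variables, and the values currently bound to them.
free_names = with_vars.__code__.co_freevars
captured = [cell.cell_contents for cell in with_vars.__closure__]

print(uses_closure(no_vars), uses_closure(with_vars), free_names, captured)
```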

Then serializing and loading the lambda is fairly straightforward:

>>> import marshal, types
>>> old = lambda: 42
>>> old_code_serialized = marshal.dumps(old.func_code)
>>> new_code = marshal.loads(old_code_serialized)
>>> new = types.FunctionType(new_code, globals())
>>> new()
42
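
On Python 3 the code object lives in `__code__` instead of `func_code`. A rough equivalent of the round trip is sketched below. Note that default argument values are stored on the function, not on the code object, so this sketch ships them separately. The variable names and the `"restored"` function name are illustrative assumptions:

```python
import marshal
import types

old = lambda x, y=2: x * y

# Machine A: the code object and the default values travel separately,
# because defaults are stored on the function, not on the code object.
code_bytes = marshal.dumps(old.__code__)
defaults_bytes = marshal.dumps(old.__defaults__)

# Machine B: rebuild the function from the two payloads.
new_code = marshal.loads(code_bytes)
new_defaults = marshal.loads(defaults_bytes)
new = types.FunctionType(new_code, globals(), "restored", new_defaults)

print(new(3), new(3, 4))
```

Both sides need to run the same Python version, since the marshal format and bytecode are version-specific.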

It's worth taking a look at the documentation for the FunctionType:

function(code, globals[, name[, argdefs[, closure]]])

Create a function object from a code object and a dictionary.
The optional name string overrides the name from the code object.
The optional argdefs tuple specifies the default argument values.
The optional closure tuple supplies the bindings for free variables.

Notice that you can also supply a closure, which means you might even be able to serialize the old function's closure and then load it at the other end :)
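
A minimal sketch of that idea for Python 3.8+, under some assumptions: the captured values are picklable, both sides run the same Python version, and the helper names `dump_function` and `load_function` are made up for this example. It pickles the cell contents next to the marshaled code and rebuilds the cells with `types.CellType`:

```python
import marshal
import pickle
import types


def make_adder(n):
    return lambda x: x + n


def dump_function(func):
    """Return (code_bytes, closure_bytes) for a function or lambda."""
    cells = func.__closure__ or ()
    contents = [cell.cell_contents for cell in cells]
    return marshal.dumps(func.__code__), pickle.dumps(contents)


def load_function(code_bytes, closure_bytes, namespace):
    """Rebuild a function; namespace becomes its globals."""
    code = marshal.loads(code_bytes)
    contents = pickle.loads(closure_bytes)
    # One fresh cell per captured value, in the same order as co_freevars.
    closure = tuple(types.CellType(value) for value in contents)
    return types.FunctionType(code, namespace, code.co_name, None, closure or None)


add5 = make_adder(5)
restored = load_function(*dump_function(add5), globals())
print(restored(10))
```

The rebuilt cells are copies, so the restored function no longer shares state with the scope that originally created the lambda.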

Chattel answered 9/8, 2012 at 7:12 Comment(10)
There is a typo in the last print statement: finc_closure should be func_closure. – Vena
I do not know about other versions of Python, but the one I am using (2.7.3) is not able to serialize the closure (either with marshal or pickle) or to print its content like you did with 'print vars.func_closure[0]'. – Vena
D'oh! That's because you need to print x.func_closure[0].cell_contents; I've updated the answer now. Are you still having trouble with serializing the lambda's func_code? – Chattel
Yes, cell_contents is simply not defined in the version of Python that is running on my system. I eventually gave up and made the lambdas class-based functors. – Vena
Ah, weird. Which version of Python are you using? – Chattel
2.7.3, the default one from Ubuntu. – Vena
Wow, weird. I used 2.7 to test the above. If you're interested in looking into it, you can use c = x.func_closure[0], then dir(c) to see what attributes are available on c. – Chattel
Oh, wow. It works now. I wonder what I could possibly have done wrong... Thanks again! – Vena
Just a note for others using Python 3.0+: the name of func_closure was changed to __closure__. See docs.python.org/3.0/whatsnew/… for more info. – Tobin
Also, for Python 3.0+, func_code was changed to __code__. – Blowfly
F
4

I'm not sure exactly what you want to do, but you could try dill. Dill can serialize and deserialize lambdas, and I believe it also works for lambdas inside closures. The pickle API is a subset of its API. To use it, just "import dill as pickle" and go about your business pickling stuff.

>>> import dill
>>> testme = lambda x: lambda y:x
>>> _testme = dill.loads(dill.dumps(testme))
>>> testme
<function <lambda> at 0x1d92530>
>>> _testme
<function <lambda> at 0x1d924f0>
>>> 
>>> def complicated(a,b):
...   def nested(x):
...     return testme(x)(a) * b
...   return nested
... 
>>> _complicated = dill.loads(dill.dumps(complicated))
>>> complicated 
<function complicated at 0x1d925b0>
>>> _complicated
<function complicated at 0x1d92570>

Dill registers its types into the pickle registry, so if you have some black box code that uses pickle and you can't really edit it, then just importing dill can magically make it work without monkeypatching the 3rd party code. Or, if you want the whole interpreter session sent over the wire as a "Python image", dill can do that too.

>>> # continuing from above
>>> dill.dump_session('foobar.pkl')
>>>
>>> ^D
dude@sakurai>$ python
Python 2.7.5 (default, Sep 30 2013, 20:15:49) 
[GCC 4.2.1 (Apple Inc. build 5566)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foobar.pkl')
>>> testme(4)
<function <lambda> at 0x1d924b0>
>>> testme(4)(5)
4
>>> dill.source.getsource(testme)
'testme = lambda x: lambda y:x\n'

You can easily send the image across ssh to another computer and pick up where you left off there, as long as the pickle versions are compatible, with the usual caveats about Python changing and things being installed. As shown, you can also extract the source of the lambda that was defined in the previous session.

Dill also has some good tools for helping you understand what is causing your pickling to fail.

Fillian answered 13/5, 2013 at 21:39 Comment(0)
W
0

I wrote a library called msgpickle (installable with pip). I wrote it because I wanted a pickler where I could easily control what can and what cannot be pickled.

So, while pickling lambdas is unsafe, you can enable it as needed, create new picklers for any class, and build a serialization strategy that is sound.

It is built on msgpack, which it uses as its default serializer.

In order to serialize lambdas, you can do this:

serializer = msgpickle.MsgPickle()
serializer.register(*msgpickle.cloud_function_serializer)

Now your serializer supports:

dat = serializer.dumps(lambda: 99)
fun = serializer.loads(dat)
assert fun() == 99

The core of it is the function-packer, which just uses python's code object:

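import inspect
import types
from typing import Any

# Assumption: code_type_params is not defined in this excerpt. It is taken
# here to be the parameters of the types.CodeType constructor, which needs a
# Python version where inspect can read that signature.
code_type_params = inspect.signature(types.CodeType).parameters


def cloud_func_pack(obj: Any) -> Any:
    code_obj = obj.__code__
    # Derive the co_* attribute names from the constructor's parameter names;
    # xmap lists the two parameters whose names differ from their attribute.
    # this has a chance of working for future versions of Python
    xmap = {"codestring": "code", "constants": "consts"}
    code_arg_names = [
        "co_" + xmap.get(param.name, param.name) for param in code_type_params.values()
    ]

    def convert(value: Any) -> Any:
        # msgpack has no tuple type, so tuples are sent as lists.
        if isinstance(value, tuple):
            return list(value)
        return value

    code_attributes = [convert(getattr(code_obj, attr)) for attr in code_arg_names]
    return code_attributes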

That means it's unsafe across Python versions, but that's the same as any other pickler.

The difference is the simplicity and explicitness.
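
One way to make the cross-version caveat explicit is to tag each payload with the interpreter version and refuse to load a mismatch. The sketch below is not part of msgpickle. It uses only the standard library's marshal, and the names `VERSION_TAG`, `dump_code` and `load_code` are hypothetical. It also ignores closures and default arguments:

```python
import marshal
import sys
import types

# Two-byte header holding the (major, minor) version of the writing interpreter.
VERSION_TAG = bytes(sys.version_info[:2])


def dump_code(func):
    """Serialize a closure-free function's code object behind a version header."""
    return VERSION_TAG + marshal.dumps(func.__code__)


def load_code(data, namespace):
    """Rebuild the function, refusing payloads from another Python version."""
    if len(data) < 2 or data[:2] != VERSION_TAG:
        raise ValueError("payload was written by a different Python version")
    return types.FunctionType(marshal.loads(data[2:]), namespace)


data = dump_code(lambda: 99)
fun = load_code(data, globals())
print(fun())
```

Checking the header before calling `marshal.loads` matters, because the marshaled bytes themselves may not be readable by another interpreter version.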

Wenwenceslaus answered 20/3 at 17:3 Comment(0)
