Python 3.5 dill pickling/unpickling on different servers: "KeyError: 'ClassType'"

See updates at the bottom

--

A similar question was asked here, but never resolved: pickling and unpickling user-defined class

I'm working on a project that requires pickling user-defined classes and sending them to a remote server, where they are unpickled and called. We use the dill library to accomplish this, and have had a lot of success.

Unfortunately, I've run into an issue I'm having a hard time debugging. I create and pickle a class as follows:

import dill, base64
import time, random

class periodicSource(object):
    def __call__(self):
        while True:
            time.sleep(0.1)
            yield random.uniform(20,100)

periodic_src = periodicSource()
a = base64.b64encode(dill.dumps(periodic_src)).decode("ascii")
print(a)

It prints a base64-encoded ASCII representation of the dilled instance (dill serializes the class definition along with it):

gANjZGlsbC5kaWxsCl9jcmVhdGVfdHlwZQpxAChjZGlsbC5kaWxsCl9sb2FkX3R5cGUKcQFYCQAAAENsYXNzVHlwZXEChXEDUnEEWA4AAABwZXJpb2RpY1NvdXJjZXEFaAFYBgAAAG9iamVjdHEGhXEHUnEIhXEJfXEKKFgIAAAAX19jYWxsX19xC2NkaWxsLmRpbGwKX2NyZWF0ZV9mdW5jdGlvbgpxDChoAVgIAAAAQ29kZVR5cGVxDYVxDlJxDyhLAUsASwFLA0tjQyl4IgB0AABqAQBkAQCDAQABdAIAagMAZAIAZAMAgwIAVgFxAwBXZAAAU3EQKE5HP7mZmZmZmZpLFEtkdHERKFgEAAAAdGltZXESWAUAAABzbGVlcHETWAYAAAByYW5kb21xFFgHAAAAdW5pZm9ybXEVdHEWWAQAAABzZWxmcReFcRhYHwAAADxpcHl0aG9uLWlucHV0LTIwLTdhNGU5MDIwYWM2Yz5xGWgLSwdDBgABAwENAXEaKSl0cRtScRx9cR0oWAYAAAByYW5kb21xHmNkaWxsLmRpbGwKX2ltcG9ydF9tb2R1bGUKcR9oFIVxIFJxIVgEAAAAdGltZXEiaB9YBAAAAHRpbWVxI4VxJFJxJXVoC05OfXEmdHEnUnEoWAoAAABfX21vZHVsZV9fcSlYCAAAAF9fbWFpbl9fcSpYBwAAAF9fZG9jX19xK05YDQAAAF9fc2xvdG5hbWVzX19xLF1xLXV0cS5ScS8pgXEwLg==

When I go to deserialize it on the other server:

a = 'gANjZGlsbC5kaWxsCl9jcmVhdGVfdHlwZQpxAChjZGlsbC5kaWxsCl9sb2FkX3R5cGUKcQFYCQAAAENsYXNzVHlwZXEChXEDUnEEWA4AAABwZXJpb2RpY1NvdXJjZXEFaAFYBgAAAG9iamVjdHEGhXEHUnEIhXEJfXEKKFgIAAAAX19jYWxsX19xC2NkaWxsLmRpbGwKX2NyZWF0ZV9mdW5jdGlvbgpxDChoAVgIAAAAQ29kZVR5cGVxDYVxDlJxDyhLAUsASwFLA0tjQyl4IgB0AABqAQBkAQCDAQABdAIAagMAZAIAZAMAgwIAVgFxAwBXZAAAU3EQKE5HP7mZmZmZmZpLFEtkdHERKFgEAAAAdGltZXESWAUAAABzbGVlcHETWAYAAAByYW5kb21xFFgHAAAAdW5pZm9ybXEVdHEWWAQAAABzZWxmcReFcRhYHwAAADxpcHl0aG9uLWlucHV0LTIwLTdhNGU5MDIwYWM2Yz5xGWgLSwdDBgABAwENAXEaKSl0cRtScRx9cR0oWAYAAAByYW5kb21xHmNkaWxsLmRpbGwKX2ltcG9ydF9tb2R1bGUKcR9oFIVxIFJxIVgEAAAAdGltZXEiaB9YBAAAAHRpbWVxI4VxJFJxJXVoC05OfXEmdHEnUnEoWAoAAABfX21vZHVsZV9fcSlYCAAAAF9fbWFpbl9fcSpYBwAAAF9fZG9jX19xK05YDQAAAF9fc2xvdG5hbWVzX19xLF1xLXV0cS5ScS8pgXEwLg=='
a = dill.loads(base64.b64decode(a.encode()))
print(a)

I get the following error:

/home/streamsadmin/anaconda3/bin/python /home/streamsadmin/git/streamsx.topology/test/python/topology/deleteme2.py

Traceback (most recent call last):
 File "/home/streamsadmin/git/streamsx.topology/test/python/topology/deleteme2.py", line 40, in <module>
   a = dill.loads(base64.b64decode(a.encode()))
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 277, in loads
   return load(file)
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 266, in load
   obj = pik.load()
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 524, in _load_type
   return _reverse_typemap[name]
KeyError: 'ClassType'

I would expect this if I were using different versions of Python on the two systems, but they're the same:

Server 1:

>>> import sys
>>> sys.version
'3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]'

Server 2:

>>> import sys
>>> sys.version
'3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:53:06) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]'

Additionally, both versions of Dill are 0.2.6. Any ideas how I could debug this?

EDIT: I think it might be something with my environment. I'm using Python 3.5, but when I list the contents of the types module:

>>> import types
>>> dir(types)
  ['BuiltinFunctionType',
   'BuiltinMethodType',
   'ClassType',
   'CodeType',
   ...
  ]

ClassType appears in the output, which should NOT be the case: types.ClassType is a Python 2 name that was removed in Python 3.0, so it should not exist under Python 3.5. This is exceedingly strange.

I'm running on a system that has both Python 2.7 and Python 3.5 installed. Could the 2.7 installation somehow be polluting the 3.5 installation?
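
One way to check which interpreter and which copy of the types module a session is actually using is sketched below. The helper name describe_types_module is hypothetical, and the values it reports depend entirely on the local environment.

```python
import sys
import types


def describe_types_module():
    """Collect facts about the running interpreter and its types module."""
    return {
        "python_version": sys.version_info[:3],
        "executable": sys.executable,
        # Path of the types module that was really imported
        "types_file": getattr(types, "__file__", None),
        # True means something has added the Python 2 name to types
        "has_ClassType": hasattr(types, "ClassType"),
    }


for key, value in describe_types_module().items():
    print(key, "=", value)
```

Running this on both servers, in the same session that calls dill, makes it easier to see whether the interpreter paths or the types module differ.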

Cellobiose answered 22/3, 2017 at 19:9 Comment(16)
I'm not familiar with Dill, but with pickle you need to have all functions (and I think classes) available when you unpickle, as it just stores references to where those are. – Puke
Pickle does that. Dill manages to serialize dependencies along with the class, so that's not the issue. – Cellobiose
If I do this on my Linux Mint with Python 3.4, I get a different string, but it works. If I use your string, I get the same error you do. My encoded string is too big to post here, so see pastebin.com/Pu6MCkhg – Flin
The string you posted decodes successfully: "<__main__.periodicSource object at 0x7f600282b5c0>" But the question remains of why the one server is serializing differently. – Cellobiose
Hm, I just realized that the versions of Anaconda are different. One is Anaconda 4.2.0, and the other is Anaconda 4.1.1. Although, the versions of Dill are still the same at 0.2.6. I'm downloading Anaconda 4.1.1 now to see if their versions of dill are, in fact, compatible. – Cellobiose
Nope. The issue still persists when deserializing with Anaconda 4.1.1. – Cellobiose
Hi, I'm the dill author. I am not seeing your error -- however I'm using the latest dill and python 3.5.3 everywhere. Actually, I've never seen the error you are seeing... but the traceback says that the ClassType is not registered in the typemap... which is super weird. – Ancier
Unpickling errors are harder to debug than pickling errors. So, I suggest two things for a start. (1) add dill.extend(True) before the dill.loads, just to ensure that all dill types are loaded into the registry, and (2) add dill.detect.trace(True) before the dill.dumps to see what exactly is going into the serialization. – Ancier
A third suggestion is to open an issue on the dill GitHub page. This seems like it might take some back-and-forth, and that's probably a better venue for that type of conversation than SO. – Ancier
Another note: you can test if it's an issue of code differences between the first and second server by simply saving the serialized string to your clipboard... starting a new session on the first server, and pasting the string into the session. If it works (using two sessions and one server), then it's an issue of some difference between installs on the servers (or something like that). – Ancier
Hi Mike, thank you for the advice. I'll create a GH issue on the dill page and follow up here with the results. – Cellobiose
@MikeMcKerns, I've determined it's probably not a Dill issue. Take a look at the top-level edit to this issue. It seems that, somehow, ClassType is finding its way into the types __dict__ and confusing dill. – Cellobiose
That could definitely be the cause... it's a bit unexpected. Note that in the pickle registry, dill uses ClassType = type in python 3.x... so you could re-register the function in the dispatch table manually if you need to. However, I'd look to finding why it's kicked out in the first place. – Ancier
@MikeMcKerns, Eureka! That did it. I added dill.dill._reverse_typemap['ClassType'] = type just before deserialization. Hm, I don't think dill actually does define ClassType anywhere. In dill 0.2.6 in dill.py at line 486 where _reverse_typemap is defined, I don't see a ClassType key. – Cellobiose
So breaking it down -- Server A has types.ClassType defined, and server B does not. Server A serializes an object and passes it to server B. On deserialization, server B complains about ClassType not being defined in the _reverse_typemap. A workaround is to manually set dill.dill._reverse_typemap['ClassType'] = type before deserialization on server B. – Cellobiose
You should answer your own question below. – Ancier

The culprit is cloudpickle. By default in Python 3.5, types.ClassType is not defined.

>>> import types
>>> dir(types)
['BuiltinFunctionType', 'BuiltinMethodType', 'CodeType', ...]

When cloudpickle is imported, suddenly, types.ClassType becomes defined.

>>> import cloudpickle
>>> dir(types)
['BuiltinFunctionType', 'BuiltinMethodType', 'ClassType', 'CodeType', ...]

Server A uses dill to serialize objects, and also imports cloudpickle. Therefore it includes a reference to ClassType during serialization.

Server B does NOT import cloudpickle, so when it tries to look up ClassType during deserialization, the lookup fails and raises the error:

Traceback (most recent call last):
 File "/home/streamsadmin/git/streamsx.topology/test/python/topology/deleteme2.py", line 40, in <module>
   a = dill.loads(base64.b64decode(a.encode()))
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 277, in loads
   return load(file)
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 266, in load
   obj = pik.load()
 File "/home/streamsadmin/anaconda3/lib/python3.5/site-packages/dill/dill.py", line 524, in _load_type
   return _reverse_typemap[name]
KeyError: 'ClassType'

On our system, we can't remove cloudpickle from our environment, so we had to do the following workaround.

On server B, right after we import dill and sometime before the first call to dill.loads, we invoke the following line of code:

dill._dill._reverse_typemap['ClassType'] = type

This defines ClassType appropriately and causes dill deserialization to work as expected.
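
A slightly more defensive variant of the same workaround is sketched below. It assumes that dill's private _reverse_typemap is a plain dict, as the traceback suggests; the helper name register_class_type is made up for illustration. Because _reverse_typemap is a private name, this may break with other dill releases.

```python
def register_class_type(typemap):
    """Add a 'ClassType' entry pointing at the built-in type, if it is missing."""
    # setdefault leaves an existing entry untouched
    typemap.setdefault("ClassType", type)
    return typemap


try:
    # dill._dill is the internal module name used in the line above
    import dill._dill as dill_internals
except ImportError:
    dill_internals = None  # dill is missing or uses a different module layout

if dill_internals is not None and hasattr(dill_internals, "_reverse_typemap"):
    register_class_type(dill_internals._reverse_typemap)
```

Run this once on server B, after importing dill and before the first call to dill.loads.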

Cellobiose answered 24/3, 2017 at 17:48 Comment(1)
Thank you for this. A minor change: dill._dill._reverse_typemap['ClassType'] = type – Storybook

I am sure cloudpickle is causing the problem. You can debug it step by step.

  1. First, check whether ClassType exists in the types module:

    import types
    dir(types)

If it exists, then it should have worked for you; if not, move on to the next steps.

  2. Import cloudpickle and check again. ClassType will now appear in the types module.

  3. Execute the code below:

    dill.dill._reverse_typemap['ClassType'] = type

It should work for you :)

But if you still get the error AttributeError: module 'dill' has no attribute 'dill',

then use dill._dill._reverse_typemap['ClassType'] = type instead, because dill.dill was moved to dill._dill.
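
If the same code has to run against both module layouts, the lookup can be tried under each name in turn. This is only a sketch: load_dill_internals is a hypothetical helper, and the two candidate module names are the ones mentioned above.

```python
import importlib


def load_dill_internals(candidates=("dill._dill", "dill.dill")):
    """Return the first candidate module that can be imported, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next known location
    return None


internals = load_dill_internals()
if internals is not None and hasattr(internals, "_reverse_typemap"):
    # Same assignment as above, applied to whichever module was found
    internals._reverse_typemap["ClassType"] = type
```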

Wilcher answered 6/9, 2018 at 10:25 Comment(0)
