Python: can't pickle module objects error

I'm trying to pickle a big class and getting

TypeError: can't pickle module objects

Despite looking around the web, I can't figure out exactly what this means, and I'm not sure which module object is causing the trouble. Is there a way to find the culprit? The stack trace doesn't seem to indicate anything.

Malang answered 7/5, 2010 at 18:33 Comment(2)
Kinda difficult to tell without seeing the code.Candytuft
what code are you running?Heliotropin
M
22

I can reproduce the error message this way:

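import cPickle  # Python 2; in Python 3 this module is simply `pickle`

class Foo(object):
    def __init__(self):
        self.mod = cPickle  # instance attribute that references a module

foo = Foo()
with open('/tmp/test.out', 'wb') as f:
    cPickle.dump(foo, f)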

# TypeError: can't pickle module objects

Do you have an attribute, on the class or on an instance, that references a module?
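
For readers on Python 3, where cPickle and file() no longer exist, a rough equivalent of the reproduction above might look like the sketch below. The exact wording of the error message differs between Python versions, so the code only assumes that a TypeError mentioning "module" is raised.

```python
import pickle


class Foo:
    def __init__(self):
        # Instance attribute that points at a module object.
        self.mod = pickle


foo = Foo()
message = None
try:
    pickle.dumps(foo)
except TypeError as exc:
    # The text of the message varies between Python versions.
    message = str(exc)
    print("TypeError:", message)
```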

Mindamindanao answered 7/5, 2010 at 18:41 Comment(3)
+1. This is the only way (referencing modules) I've seen this happen (as in import os,cPickle;cPickle.dumps(os.path))Silverside
That makes sense, but I'm not sure how to find it. (I didn't write this class myself, and it's 3500 lines long.) Any idea how I might locate the reference? Thanks!Malang
The only thing that springs to mind is recursive descent: do a dir(...) on the object and try to pickle each of the attributes separately. Take the one that gives the error, and repeat until you have found the module object.Bubb
C
23

Python's inability to pickle module objects is the real problem. Is there a good reason? I don't think so. Having module objects unpicklable contributes to the frailty of Python as a parallel / asynchronous language. If you want to pickle module objects, or almost anything in Python, then use dill.

Python 3.2.5 (default, May 19 2013, 14:25:55) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> import os
>>> dill.dumps(os)
b'\x80\x03cdill.dill\n_import_module\nq\x00X\x02\x00\x00\x00osq\x01\x85q\x02Rq\x03.'
>>>
>>>
>>> # and for parlor tricks...
>>> class Foo(object):
...   x = 100
...   def __call__(self, f):
...     def bar(y):
...       return f(self.x) + y
...     return bar
... 
>>> @Foo()
... def do_thing(x):
...   return x
... 
>>> do_thing(3)
103 
>>> dill.loads(dill.dumps(do_thing))(3)
103
>>> 

Get dill here: https://github.com/uqfoundation/dill

Cumulostratus answered 22/3, 2014 at 15:22 Comment(4)
When I try to load the dumped object, I get the following error: *** RecursionError: maximum recursion depth exceeded Stomach
@alper: I'm assuming whatever you are experiencing is different from the OP's problem. You should either create a new post of your own or open a new ticket on dill's GitHub. There's not enough information in your comment for anyone to help you.Cumulostratus
This occurs if the object dumped by dill is larger than 1MB. I assume dill does not handle large objects efficiently and crashes.Stomach
@alper: I haven't experienced issues with loading objects larger than 1MB. Maybe you can open a ticket on dill's GitHub, and include your version of dill, of python, and self-contained example code that reproduces what you are experiencing.Cumulostratus
H
12

Recursively Find Pickle Failure

Inspired by wump's comment: Python: can't pickle module objects error

Here is some quick code that helped me find the culprit recursively.

It first checks whether the object in question fails to pickle.

If it does, it recurses into the values in the object's __dict__ and returns a nested result containing only the objects that failed to pickle.

Code Snippet

import pickle

def pickle_trick(obj, max_depth=10):
    output = {}

    if max_depth <= 0:
        return output

    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError) as e:
        failing_children = []

        if hasattr(obj, "__dict__"):
            for k, v in obj.__dict__.items():
                result = pickle_trick(v, max_depth=max_depth - 1)
                if result:
                    failing_children.append(result)

        output = {
            "fail": obj, 
            "err": e, 
            "depth": max_depth, 
            "failing_children": failing_children
        }

    return output

Example Program

import redis

import pickle
from pprint import pformat as pf


def pickle_trick(obj, max_depth=10):
    output = {}

    if max_depth <= 0:
        return output

    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError) as e:
        failing_children = []

        if hasattr(obj, "__dict__"):
            for k, v in obj.__dict__.items():
                result = pickle_trick(v, max_depth=max_depth - 1)
                if result:
                    failing_children.append(result)

        output = {
            "fail": obj, 
            "err": e, 
            "depth": max_depth, 
            "failing_children": failing_children
        }

    return output


if __name__ == "__main__":
    r = redis.Redis()
    print(pf(pickle_trick(r)))

Example Output

$ python3 pickle-trick.py
{'depth': 10,
 'err': TypeError("can't pickle _thread.lock objects"),
 'fail': Redis<ConnectionPool<Connection<host=localhost,port=6379,db=0>>>,
 'failing_children': [{'depth': 9,
                       'err': TypeError("can't pickle _thread.lock objects"),
                       'fail': ConnectionPool<Connection<host=localhost,port=6379,db=0>>,
                       'failing_children': [{'depth': 8,
                                             'err': TypeError("can't pickle _thread.lock objects"),
                                             'fail': <unlocked _thread.lock object at 0x10bb58300>,
                                             'failing_children': []},
                                            {'depth': 8,
                                             'err': TypeError("can't pickle _thread.RLock objects"),
                                             'fail': <unlocked _thread.RLock object owner=0 count=0 at 0x10bb58150>,
                                             'failing_children': []}]},
                      {'depth': 9,
                       'err': PicklingError("Can't pickle <function Redis.<lambda> at 0x10c1e8710>: attribute lookup Redis.<lambda> on redis.client failed"),
                       'fail': {'ACL CAT': <function Redis.<lambda> at 0x10c1e89e0>,
                                'ACL DELUSER': <class 'int'>,
                                .........
                                'ZSCORE': <function float_or_none at 0x10c1e5d40>},
                       'failing_children': []}]}
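
The output above shows which objects fail, but not the attribute names that lead to them. A minimal variation on the same recursive idea, sketched below, records a dotted path to each failure and stops at module objects instead of descending into them. The names find_unpicklable, Inner and Outer are hypothetical and not part of the original answer, and the sketch only walks __dict__ and plain dicts.

```python
import os
import pickle
import types


def find_unpicklable(obj, path="obj", max_depth=10, _seen=None):
    """Yield (path, value, error) for the deepest unpicklable objects reachable."""
    _seen = set() if _seen is None else _seen
    if max_depth <= 0 or id(obj) in _seen:
        return
    _seen.add(id(obj))

    try:
        pickle.dumps(obj)
    except Exception as exc:  # pickle can raise several exception types
        error = exc
    else:
        return  # picklable, nothing to report

    # Never walk into a module's globals; report the module itself.
    if isinstance(obj, types.ModuleType):
        yield path, obj, error
        return

    children = []
    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif hasattr(obj, "__dict__"):
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]

    found_deeper = False
    for child_path, child in children:
        for hit in find_unpicklable(child, child_path, max_depth - 1, _seen):
            found_deeper = True
            yield hit

    # No child explains the failure, so this object is the best guess.
    if not found_deeper:
        yield path, obj, error


class Inner:
    def __init__(self):
        self.mod = os  # the hidden module reference


class Outer:
    def __init__(self):
        self.name = "example"
        self.inner = Inner()


hits = list(find_unpicklable(Outer(), path="outer"))
for hit_path, value, error in hits:
    print(hit_path, "->", type(value).__name__, "|", error)
```

For the toy classes here, the only reported path should be outer.inner.mod, which is the kind of pointer the original question was asking for.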

Root Cause - Redis can't pickle _thread.lock

In my case, creating an instance of Redis that I saved as an attribute of an object broke pickling.

When you create an instance of Redis, it also creates a connection_pool that holds thread locks, and those locks cannot be pickled.

I had to create and clean up the Redis instance within the multiprocessing.Process itself, so that it was never part of the object being pickled.
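
If moving the resource is not practical, another common option (not what I did above, just a sketch of the standard pickle hooks) is to leave the unpicklable attribute out of the pickled state with __getstate__ and recreate it in __setstate__. The Worker class and its _lock attribute are hypothetical stand-ins for an object that holds a Redis client or a module reference.

```python
import pickle
import threading


class Worker:
    """Hypothetical class that holds an unpicklable resource (a lock)."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()  # stands in for a client or module

    def __getstate__(self):
        # Copy the instance state and drop the unpicklable attribute.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the picklable state, then recreate the resource.
        self.__dict__.update(state)
        self._lock = threading.Lock()


clone = pickle.loads(pickle.dumps(Worker("w1")))
print(clone.name)
```

The trade-off is that the recreated resource is a fresh one, so anything tied to the old instance (an open connection, a held lock) is not carried over.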

Testing

In my case, the class that I was trying to pickle must be picklable, so I added a unit test that creates an instance of the class and pickles it. That way, if anyone modifies the class so that it can no longer be pickled, breaking its ability to be used in multiprocessing (and pyspark), we will detect the regression straight away.

def test_can_pickle():
    # Given
    obj = MyClassThatMustPickle()

    # When / Then
    pkl = pickle.dumps(obj)

    # This test will throw an error if it is no longer pickling correctly

Hennessey answered 21/1, 2020 at 0:22 Comment(0)
S
4

According to the documentation:

What can be pickled and unpickled?

The following types can be pickled:

  • None, True, and False
  • integers, floating point numbers, complex numbers
  • strings, bytes, bytearrays
  • tuples, lists, sets, and dictionaries containing only picklable objects
  • functions defined at the top level of a module (using def, not lambda)
  • built-in functions defined at the top level of a module
  • classes that are defined at the top level of a module
  • instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).

As you can see, modules are not part of this list. Note that this also applies to deepcopy, not only to the pickle module, as stated in the documentation of deepcopy:

This module does not copy types like module, method, stack trace, stack frame, file, socket, window, array, or any similar types. It does “copy” functions and classes (shallow and deeply), by returning the original object unchanged; this is compatible with the way these are treated by the pickle module.
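
A quick way to see the deepcopy behaviour for yourself is the small check below. Holder is a hypothetical class used only for illustration, and the exact error text depends on the Python version.

```python
import copy
import os


class Holder:
    def __init__(self):
        self.mod = os  # attribute that references a module


deepcopy_failed = False
try:
    copy.deepcopy(Holder())
except TypeError as exc:
    # deepcopy falls back to the same reduction machinery as pickle.
    deepcopy_failed = True
    print("deepcopy failed:", exc)

# A shallow copy only copies the reference, so it still works.
shallow = copy.copy(Holder())
```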

A possible workaround is using the @property decorator instead of an attribute, so that the module is never stored in the instance's __dict__. For example, this should work:

    import numpy as np
    import pickle
    
    class Foo():
        @property
        def module(self):
            return np
    
    foo = Foo()
    with open('test.out', 'wb') as f:
        pickle.dump(foo, f)


 
Swayback answered 21/9, 2019 at 13:39 Comment(0)
U
1

For Flask 2.x users who get "TypeError: can't pickle module objects"

If you get this error when trying to display your model using the @dataclass decorator, ensure you are using lazy='joined' in your db.relationship().

Unquote answered 5/9, 2023 at 13:50 Comment(1)
First off, nice answer, so I upvoted it. Nonetheless I have a bit of style related feedback. Personally, I'd start off clarifying that this is added for future reference for those using Flask 2+ who are getting "TypeError: can't pickle module objects" followed by your provided solution. The reason for this is that in my experience these answers are easier when using a web search for a specific problem (i.e. a web search query "TypeError: can't pickle module objects with Flask").Antilogy

© 2022 - 2024 — McMap. All rights reserved.