How to dill (pickle) to file?

The question may seem a little basic, but I wasn't able to find anything I understood on the internet. How do I store something that I pickled with dill in a file?

I have come this far for saving my construct (pandas DataFrame, which also contains custom classes):

import dill
dill_file = open("data/2017-02-10_21:43_resultstatsDF", "wb")
dill_file.write(dill.dumps(resultstatsDF))
dill_file.close()

and for reading

dill_file = open("data/2017-02-10_21:43_resultstatsDF", "rb")
resultstatsDF_out = dill.load(dill_file.read())
dill_file.close()

but when reading I get the error

TypeError: file must have 'read' and 'readline' attributes

How do I do this?


EDIT for future readers: After using this approach (pickling my DataFrame) for a while, I now refrain from doing so. As it turns out, differing program versions (including the versions of the objects stored in the dill file) can make it impossible to recover the pickled file. I now make sure that everything I want to save can be expressed as a string (as efficiently as possible), ideally a human-readable one. I store my data as CSV, and objects in CSV cells can be represented as JSON. That way I make sure that my files will still be readable in the months and years to come. Even if the code changes, I can rewrite the encoders by parsing the strings, and I can understand the CSV by inspecting it manually.
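
A minimal sketch of the CSV-plus-JSON idea described in the edit above, using only the standard library. The `Stats` class, the column names, and the sample rows are hypothetical stand-ins, not the asker's actual data, and an in-memory buffer is used in place of a real file:

```python
import csv
import io
import json
from dataclasses import dataclass, asdict


@dataclass
class Stats:  # hypothetical custom class standing in for the asker's objects
    mean: float
    count: int


rows = [("run_a", Stats(0.5, 10)), ("run_b", Stats(1.25, 4))]

# Write: every cell that holds an object becomes one JSON string.
buffer = io.StringIO()  # stands in for a file opened with newline=""
writer = csv.writer(buffer)
writer.writerow(["name", "stats"])
for name, stats in rows:
    writer.writerow([name, json.dumps(asdict(stats))])

# Read: parse the JSON cell and rebuild the object explicitly.
buffer.seek(0)
reader = csv.DictReader(buffer)
restored = [(row["name"], Stats(**json.loads(row["stats"]))) for row in reader]

assert restored == rows
```

Because the decoding step is ordinary code rather than something baked into the file, it can be adapted if the class changes later, which is the point the edit makes.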

Conflux answered 10/2, 2017 at 20:47 Comment(2)
Thanks for the edit. I'm running into similar issues. - Meara
@PeterSmit: I am glad it helped! You can leave an upvote :-). - Conflux

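dill.load expects an open file object, but dill_file.read() returns bytes, hence the TypeError. Just give it the file itself, without calling read():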

resultstatsDF_out = dill.load(dill_file)

You can also dump to a file directly, like this:

with open("data/2017-02-10_21:43_resultstatsDF", "wb") as dill_file:
    dill.dump(resultstatsDF, dill_file)

So:

dill.dump(obj, open_file)

writes to a file directly. Whereas:

dill.dumps(obj) 

serializes obj to a bytes object, which you can then write to a file yourself.

Likewise:

dill.load(open_file)

reads from a file, and:

dill.loads(serialized_obj)

constructs an object from serialized bytes, which you could have read from a file.
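
Putting the four calls side by side, here is a small round-trip sketch. The dictionary and the temporary file name are placeholders for the asker's DataFrame and path, and the fallback to the standard pickle module (which exposes the same four function names) is only there so the snippet runs where dill is not installed:

```python
import os
import tempfile

try:
    import dill
except ImportError:  # illustration-only fallback; pickle has the same four calls
    import pickle as dill

obj = {"name": "resultstats", "values": [1, 2, 3]}  # stand-in for the DataFrame
path = os.path.join(tempfile.mkdtemp(), "obj.dill")  # hypothetical file name

# Variant 1: dump/load operate on an open binary file object.
with open(path, "wb") as dill_file:
    dill.dump(obj, dill_file)
with open(path, "rb") as dill_file:
    from_file = dill.load(dill_file)

# Variant 2: dumps/loads operate on bytes; the file I/O is done by hand.
with open(path, "wb") as dill_file:
    dill_file.write(dill.dumps(obj))
with open(path, "rb") as dill_file:
    from_bytes = dill.loads(dill_file.read())

assert from_file == obj and from_bytes == obj
```

Both variants produce the same file contents in practice; mixing them (as in the question, where load was given the result of read()) is what raises the TypeError.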

It is recommended to open a file using the with statement.

Here:

with open(path) as fobj:
    # do something with fobj

has the same effect as:

fobj = open(path)
try:
    # do something with fobj
finally:
    fobj.close()

The file will be closed as soon as you leave the indented block of the with statement, even in the case of an exception.
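
To see the closing behaviour for yourself, a short sketch like the following can help. The file name and the deliberately raised ValueError are made up for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file name

try:
    with open(path, "w") as fobj:
        fobj.write("partial")
        # Simulate a failure in the middle of the with block.
        raise ValueError("simulated failure inside the with block")
except ValueError:
    pass

# The name fobj is still bound here, but the file behind it has been closed.
closed_after_error = fobj.closed
assert closed_after_error
```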

Carder answered 10/2, 2017 at 20:50 Comment(6)
I keep getting the following error: *** RecursionError: maximum recursion depth exceeded - Pedigo
Try increasing the limit: import sys; sys.setrecursionlimit(10_000) - Haveman
If I am still facing the same error, would it be wise to increase the limit beyond 10_000? - Pedigo
Just try it out. If you increase it too much, your program will crash. It looks like you have a recursive data structure, such as a list that contains itself. You may need to fix this. - Haveman
When I add more zeros, should I add _ again, like 1_000_000? - Pedigo
@Pedigo I'd just like to add that this probably has nothing to do with your code and is instead an issue with pickle. See: #2135206 - Dwarfish
