How to read pickle file?

I created some data and stored it several times like this:

with open('filename', 'a') as f:
    pickle.dump(data, f)

Each time, the size of the file increased, but when I open the file

with open('filename', 'rb') as f:
    x = pickle.load(f)

I can see only the data from the last time. How can I read the file correctly?

Kreit answered 28/1, 2016 at 17:22 Comment(5)
You are appending objects to your file. When you unpickle, you unpickle only the first entry. Are you sure you need all those entries? If not, change to open('filename', 'wb')Tailspin
Yes, I need all entries. The file size shows that it contains all of them.Kreit
Then @jsbueno is right in his answer.Tailspin
See also: How can I use pickle to save a dict?Nitriding
I built something to view pickle files directly in your browser: pickleviewer.comTorrens
165

Pickle serializes a single object at a time, and reads back a single object - the pickled data is recorded in sequence on the file.

If you simply do pickle.load you should be reading the first object serialized into the file (not the last one as you've written).

After deserializing the first object, the file pointer is at the beginning of the next object. If you simply call pickle.load again, it will read that next object. Repeat that until the end of the file.

import pickle

objects = []
with open("myfile", "rb") as openfile:
    while True:
        try:
            objects.append(pickle.load(openfile))
        except EOFError:
            break
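
As a quick check of the behaviour described above, the following sketch appends three objects to a temporary file and reads them back. The file name `demo.pkl` and the sample values are made up for illustration. Note the binary modes `'ab'` and `'rb'`, which Python 3 requires for pickle data.

```python
import os
import pickle
import tempfile

# Hypothetical sample data and file name, used only for this demonstration.
samples = [{"run": 1}, {"run": 2}, {"run": 3}]
path = os.path.join(tempfile.mkdtemp(), "demo.pkl")

# Append each object as its own pickle record (binary append mode).
for item in samples:
    with open(path, "ab") as f:
        pickle.dump(item, f)

# A single load returns the first record, not the last.
with open(path, "rb") as f:
    first = pickle.load(f)

# Repeated loads walk through the records until EOFError is raised.
objects = []
with open(path, "rb") as f:
    while True:
        try:
            objects.append(pickle.load(f))
        except EOFError:
            break

print(first)
print(objects)
```

If a single load appears to return the latest data, it is worth checking whether the file was actually rewritten with `'wb'` at some point rather than appended to.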
Footpoundsecond answered 28/1, 2016 at 17:29 Comment(0)
111

There is a read_pickle function as part of pandas 0.22+

import pandas as pd

obj = pd.read_pickle(r'filepath')
Nitrobenzene answered 24/1, 2020 at 15:55 Comment(1)
Are there any performance or compatibility differences between pd.read_pickle and pickle.load?Oersted
8

The following is an example of how you might write and read a pickle file. Note that if you keep appending pickle data to the file, you will need to continue reading from the file until you find what you want or an exception is generated by reaching the end of the file. That is what the last function does.

import os
import pickle


PICKLE_FILE = 'pickle.dat'


def main():
    # append data to the pickle file
    add_to_pickle(PICKLE_FILE, 123)
    add_to_pickle(PICKLE_FILE, 'Hello')
    add_to_pickle(PICKLE_FILE, None)
    add_to_pickle(PICKLE_FILE, b'World')
    add_to_pickle(PICKLE_FILE, 456.789)
    # load & show all stored objects
    for item in read_from_pickle(PICKLE_FILE):
        print(repr(item))
    os.remove(PICKLE_FILE)


def add_to_pickle(path, item):
    with open(path, 'ab') as file:
        pickle.dump(item, file, pickle.HIGHEST_PROTOCOL)


def read_from_pickle(path):
    with open(path, 'rb') as file:
        try:
            while True:
                yield pickle.load(file)
        except EOFError:
            pass


if __name__ == '__main__':
    main()
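
Because the reader above is a generator, it can also stop early once the wanted item turns up, as the text suggests. The sketch below is one possible way to do that; `iter_pickles`, `find_first` and the predicate are hypothetical names introduced here, not part of the pickle module.

```python
import os
import pickle
import tempfile


def iter_pickles(path):
    """Yield each pickled object in the file, one at a time."""
    with open(path, "rb") as file:
        while True:
            try:
                yield pickle.load(file)
            except EOFError:
                return


def find_first(path, predicate, default=None):
    """Return the first stored object matching predicate, else default."""
    return next((obj for obj in iter_pickles(path) if predicate(obj)), default)


# Demonstration with made-up values in a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "pickle.dat")
for value in (123, "Hello", None, b"World", 456.789):
    with open(demo_path, "ab") as file:
        pickle.dump(value, file, pickle.HIGHEST_PROTOCOL)

# Stops reading as soon as the first string is found.
match = find_first(demo_path, lambda obj: isinstance(obj, str))
print(repr(match))
```

Objects after the match are never unpickled, which can matter for large files. The file is closed once the suspended generator is finalized.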
Nuggar answered 28/1, 2016 at 17:38 Comment(0)
6

I developed a software tool that opens (most) Pickle files directly in your browser (nothing is transferred so it's 100% private):

https://pickleviewer.com/ (formerly)

Now it's hosted here: https://fire-6dcaa-273213.web.app/

Edit: The source is available at https://github.com/ch-hristov/Pickle-viewer, so feel free to host it yourself.
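
According to the comments on this answer, the viewer handles only pickle protocols up to 3. A rough way to check which protocol a pickle declares, and to re-save an object with an older one, is sketched below using only the standard library. `pickle_protocol` is a hypothetical helper written for this example, and it inspects only the first pickled object in the data.

```python
import pickle


def pickle_protocol(data):
    """Return the protocol declared at the start of a pickle byte string.

    Protocols 2 and later start with the PROTO opcode (0x80) followed by
    one byte holding the version. Protocols 0 and 1 have no such header,
    so None is returned for them.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None


# Made-up sample object.
obj = {"name": "example", "values": [1, 2, 3]}

new_style = pickle.dumps(obj, protocol=4)
old_style = pickle.dumps(obj, protocol=3)  # re-saved with an older protocol

print(pickle_protocol(new_style))
print(pickle_protocol(old_style))
```

For a file on disk, reading the first two bytes in `'rb'` mode and passing them to the helper should be enough.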

Torrens answered 6/4, 2020 at 15:34 Comment(4)
Is this still active? I tried to load it, but no dice...Duwalt
@Duwalt hey, sorry it's down. I updated the answer, feel free to host this somewhere. Should be free to go. It's also available here: fire-6dcaa-273213.web.appTorrens
My output: 100% private, nothing is transferred. You can convert the file to JSON after opening it. This software only opens files created with Python 3.3 or less Contribute to development by supporting us on Patreon Sorry, we couldn't open your file. :( The following error occurred Unhandled pickle protocol version: 4Clevis
@Clevis newer protocols aren't supported (only <= 3). You can check out the github repo for more infoTorrens
0

You can also use joblib to read pickle files. It is especially useful if you are reading pickled scikit-learn models or numpy ndarray objects (joblib is installed as a dependency of scikit-learn and is specifically designed to handle numpy ndarrays).

import joblib
x = joblib.load("my_file.pkl")

Then again, both joblib and pandas build on the pickle machinery from the standard library, so in practice, both are almost the same as:

with open("my_file.pkl", "rb") as f:
    x = pickle.load(f)

It's just that file handling and some backward compatibility considerations are handled under the hood in pandas and joblib.

In particular, for the OP's specific case, a single call to either of them does not read every object in the file. You still need the same try-except loop to read all objects, e.g.:

import pandas as pd

objects = []
with open("myfile", "rb") as openfile:
    while True:
        try:
            objects.append(pd.read_pickle(openfile))
        except EOFError:
            break
Realize answered 2/3 at 10:23 Comment(0)
