What causes the error "_pickle.UnpicklingError: invalid load key, ' '."?
G

13

86

I'm trying to store 5000 data elements in an array. These 5000 elements are stored in an existing file (so it is not empty).

But I'm getting an error.

Code

def array():
    name = 'puntos.df4'

    m = open(name, 'rb')
    v = []*5000

    m.seek(-5000, io.SEEK_END)
    fp = m.tell()
    sz = os.path.getsize(name)

    while fp < sz:
        pt = pickle.load(m)
        v.append(pt)

    m.close()
    return v

Output:

line 23, in array
pt = pickle.load(m)
_pickle.UnpicklingError: invalid load key, ''.
Gynaecology answered 10/10, 2015 at 2:40 Comment(3)
maybe i'm missing something, but it looks like you're assuming each value has a size of a single byte, why do you think this is guaranteed? and why are you trying to unpickle individual values manually? was the file created using the pickle module?Saucepan
Right, i didn't notice but if I remove the "m.seek(-5000, io.SEEK_END)" part i got an EOFError. I thought that solved it but now you mention that I'm more confused. Should I edit the Question?Gynaecology
Oh and yes, the file was created using the dump() function from the pickle moduleGynaecology
J
29

Pickling is recursive, not sequential. To pickle a list, pickle starts with the containing list, then pickles the first element, diving into that element and pickling its dependencies and sub-elements until the first element is fully serialized. It then moves on to the next element, and so on, until it finishes serializing the enclosing list. In short, it's hard to treat a recursive pickle as sequential, except in some special cases. If you want to load in a special way, it's better to use a smarter pattern when you dump.

The most common pattern is to pickle everything with a single dump to a file, but then you have to load everything at once with a single load. However, if you open a file handle and do multiple dump calls (e.g. one for each element of the list, or a tuple of selected elements), then your load will mirror that: you open the file handle and do multiple load calls until you have all the list elements and can reconstruct the list. It's still not easy to selectively load only certain list elements, however. To do that, you'd probably have to store your list elements as a dict (with the index of the element or chunk as the key) using a package like klepto, which can break up a pickled dict into several files transparently and enables easy loading of specific elements.
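A minimal sketch of the multiple-dump, multiple-load pattern described above, using only the standard pickle module. The helper names dump_items and load_items are made up for this illustration; the reader simply keeps calling load until pickle signals end of file with EOFError.

```python
import pickle


def dump_items(path, items):
    # One dump call per element: the file becomes a sequence of
    # independent pickles written back to back.
    with open(path, 'wb') as f:
        for item in items:
            pickle.dump(item, f)


def load_items(path):
    # Mirror the writer: call load repeatedly until the file is exhausted.
    # pickle.load raises EOFError when there is nothing left to read.
    items = []
    with open(path, 'rb') as f:
        while True:
            try:
                items.append(pickle.load(f))
            except EOFError:
                break
    return items
```

Note that the reader never seeks to a guessed byte offset. Each load call consumes exactly one pickle and leaves the file position at the start of the next one.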

Saving and loading multiple objects in pickle file?

Juba answered 10/10, 2015 at 12:32 Comment(0)
G
41

I solved my issue by:

  • Remove the cloned project
  • Install git lfs: sudo apt-get install git-lfs
  • Set up git lfs for your user account: git lfs install
  • Clone the project again.
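If a repository tracks large files with Git LFS and git-lfs is not installed, the checkout contains small text pointer files instead of the real data, and unpickling one of them fails. A rough way to check for that case is sketched below; is_git_lfs_pointer is a hypothetical helper name, and the check only looks for the standard first line of a pointer file.

```python
def is_git_lfs_pointer(path):
    # A Git LFS pointer is a short text file whose first line names the
    # LFS spec version; a real pickle would not start with this text.
    with open(path, 'rb') as f:
        head = f.read(64)
    return head.startswith(b'version https://git-lfs.github.com/spec/v1')
```

If this returns True for the file you are trying to load, the real content was never downloaded, and running git lfs pull (or re-cloning with git-lfs installed) is the likely fix.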
Gujranwala answered 17/4, 2021 at 17:58 Comment(5)
Just to add a comment. I downloaded a large file from github -- and encounter the error message "_pickle.UnpicklingError: invalid load key". Later I found the large file is broken. I need to download it again and make sure the sha256sum is the sameNovelty
True for machine learning models that are being downloaded from HuggingFace or other resource.Rumpf
Thanks for the tip about git lfs! In my case, sudo apt-get install git-lfs, git lfs install, and git lfs pull was enough.Apps
This does not seem to have anything to do with the problem in the question.Joppa
and yet it has saved me at least twice!Haply
S
28

This may not be relevant to your specific issue, but I had a similar problem when the pickle archive had been created using gzip.

For example if a compressed pickle archive is made like this,

import gzip, pickle
with gzip.open('test.pklz', 'wb') as ofp:
    pickle.dump([1,2,3], ofp)

trying to load it with the plain built-in open() raises the error:

with open('test.pklz', 'rb') as ifp:
    print(pickle.load(ifp))
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
_pickle.UnpicklingError: invalid load key, '\x1f'.

But if the pickle file is opened using gzip, it loads without error:

with gzip.open('test.pklz', 'rb') as ifp:
    print(pickle.load(ifp))

[1, 2, 3]
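When it is not known in advance whether a pickle was gzip-compressed, one option is to look at the first two bytes, since gzip streams start with the magic bytes 0x1f 0x8b. The sketch below does that; load_maybe_gzipped is a made-up helper name, and it does not handle other compression formats.

```python
import gzip
import pickle


def load_maybe_gzipped(path):
    # Peek at the first two bytes to see whether this is a gzip stream.
    with open(path, 'rb') as f:
        magic = f.read(2)
    # Pick the matching opener, then unpickle as usual.
    opener = gzip.open if magic == b'\x1f\x8b' else open
    with opener(path, 'rb') as f:
        return pickle.load(f)
```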
Sinh answered 7/6, 2017 at 20:32 Comment(1)
Hmmm. Can you look at this question? There is a similar error, but the circumstances are different.Circumcise
B
15

If you transferred these files through disk or other means, it is likely they were not saved properly.

Burgenland answered 14/11, 2018 at 17:47 Comment(1)
This happened to me. I was sharing memory between two processes and used pickle to continuously read/write data from/to that memory. Unfortunately, I had forgotten to use a lock, so I ended up having a race condition where pickle.loads() sometimes failed because the data was corrupted (i.e. it was read while it was being overridden by the other process).Valediction
I
4

I received a similar error while loading a pickled sklearn model. The problem was that the pickle had been created via sklearn.externals.joblib and I was trying to load it with the standard pickle library. Loading it with joblib solved my problem.

Inning answered 20/4, 2021 at 15:26 Comment(0)
S
3

I am not completely sure what you're trying to achieve by seeking to a specific offset and attempting to load individual values manually. The typical usage of the pickle module is:

import pickle

# save data to a file
with open('myfile.pickle', 'wb') as fout:
    pickle.dump([1, 2, 3], fout)

# read data from a file (pickle files must be opened in binary mode)
with open('myfile.pickle', 'rb') as fin:
    print(pickle.load(fin))

# output: [1, 2, 3]

If you dumped a list, you'll load a list, there's no need to load each item individually.

You say you also got an error without seeking to the -5000 offset, so the file you're trying to read may be corrupted.

If you have access to the original data, I suggest you try saving it to a new file and reading it as in the example.

Saucepan answered 10/10, 2015 at 3:14 Comment(2)
The file contains 5000 lists. I was trying to store each list in every component of the array.Gynaecology
Hi, I could solve the problem. I'm not sure how, but I just removed the "fp" variable and the I put "while m.tell() < sz:" instead of "while fp < sz:". Thank you anyway :), and if you know the reason of this "solution" I would be thankful if you could explain it to me.Gynaecology
T
0

I had a similar error but with different context when I uploaded a *.p file to Google Drive. I tried to use it later in a Google Colab session, and got this error:

    1 with open("/tmp/train.p", mode='rb') as training_data:
----> 2     train = pickle.load(training_data)
UnpicklingError: invalid load key, '<'.

I solved it by compressing the file, uploading the archive, and then unzipping it in the session. It looks like the pickle file is not transferred correctly when you upload or download it directly, so it ends up corrupted.
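A quick way to see what was actually saved is to look at the first few bytes of the file. The sketch below is only a heuristic: describe_head is a hypothetical helper, it assumes the pickle was written with protocol 2 or later (those start with the byte 0x80), and it treats anything starting with '<' as markup such as an HTML page.

```python
def describe_head(path, n=16):
    # Read only the first n bytes; enough to tell a pickle from a web page.
    with open(path, 'rb') as f:
        head = f.read(n)
    if head.startswith(b'\x80'):
        # Protocol 2+ pickles begin with the PROTO opcode, byte 0x80.
        kind = 'pickle (protocol 2 or later)'
    elif head.lstrip().startswith(b'<'):
        kind = 'HTML or XML text'
    else:
        kind = 'unknown'
    return kind, head
```

An "invalid load key, '<'" message together with an 'HTML or XML text' result suggests that an error or login page was saved in place of the data.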

Thine answered 26/10, 2020 at 1:9 Comment(1)
I also had this issue. It turned out that the file that was meant to be downloaded from Dropbox didn't exist anymore, and that < is part of the HTML website downloaded instead, saying something like "Sorry, this file has been deleted".Lunula
B
0

I just ran into this issue; it was caused by a bad pickle file (one that had not been fully copied).

My solution: check whether the pickle file is corrupted.
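One way to check for a bad copy, assuming you still have access to the original file or to a published checksum, is to compare SHA-256 digests. The helper name sha256_of is made up for this sketch; it reads the file in chunks so that large model files do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    # Hash the file incrementally, one chunk at a time.
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

If sha256_of(copy) differs from sha256_of(original), the copy is incomplete or corrupted and should be transferred again.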

Brewington answered 1/12, 2021 at 4:48 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Kolnos
B
0

In my case, I ran into this issue because multiple processes were trying to read from the same pickled file. The first of them actually creates the pickle (a write operation), and some quick threads start reading from it too soon. Simply retrying the read when either of these two errors, EOFError or UnpicklingError, is caught made the errors go away.
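A rough sketch of such a retry loop is shown below. The function name load_with_retry and the attempts and delay parameters are illustrative choices, not part of any library, and the approach only helps when the writer eventually finishes the file.

```python
import pickle
import time


def load_with_retry(path, attempts=5, delay=0.1):
    # Retry the read if the file looks half-written; give up after
    # the last attempt and let the original exception propagate.
    for attempt in range(attempts):
        try:
            with open(path, 'rb') as f:
                return pickle.load(f)
        except (EOFError, pickle.UnpicklingError):
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```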

Bankston answered 23/11, 2022 at 22:18 Comment(0)
S
-1

I got this error from a very simple mistake: I was saving my pickle the wrong way, using to_csv instead of to_pickle:

import pandas as pd

#led to error:
df.to_csv('my_file.pkl')

#correct way:
df.to_pickle('my_file.pkl')
Shing answered 19/9, 2023 at 8:26 Comment(0)
M
-2
  1. Close the opened file

    filepath = 'model_v1.pkl'
    with open(filepath, 'rb') as f:
        p = cPickle.Unpickler(f)
        model = p.load()
    f.close()

  2. If step 1 doesn't work; restart the session

Millinery answered 9/3, 2022 at 1:25 Comment(4)
what is the use of using the close() function to close the file when you are opening it with 'with' statement??? This answer dosent help..Ka
what error you are getting when you are executing the above code ?Millinery
getting a 'close() function is useless here' errorKa
@Ka standard syntax to be followed. Developer will understand the syntax is multiline.Millinery
P
-2

Simple check: an incorrect file extension passed to read_pickle() in pandas.

For example, if your file is called my_pickle.pkl on disk, but you call pd.read_pickle('my_pickle.csv') by mistake and a CSV file with that name exists, you'll get this error.

Pendergast answered 5/10, 2023 at 18:39 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Seacoast
D
-3

Pickling error - _pickle.UnpicklingError: invalid load key, '<'.
This kind of error occurs when the weights are incomplete, or when there is some other problem with the weights/pickle file, so that unpickling the weights fails.

Diacritical answered 25/7, 2022 at 6:8 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Kolnos
