Cannot load file containing pickled data - Python .npy I/O
G

14

32

I am trying to save a dataframe and a matrix as .npy files with np.save() and then read them using np.load() but I get the following error:

  File "/Users/sofiafarina/opt/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 457, in load
    raise ValueError("Cannot load file containing pickled data "

ValueError: Cannot load file containing pickled data when allow_pickle=False

Even if I write allow_pickle=True I get an error:

  File "/Users/sofiafarina/opt/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 463, in load
    "Failed to interpret file %s as a pickle" % repr(file))

OSError: Failed to interpret file 'finaldf_p_85_12.npy' as a pickle

So how can I save a DataFrame in one Python script and then load it in another? Should I use other functions? Thank you!

Gardiner answered 12/2, 2020 at 15:25 Comment(1)
show the save commands. Doesn't pandas have its own version of save and load?Pericycle
I
16

TL;DR

After hundreds of searches and hours of debugging, I found that the issue was git-lfs: my files had never been pulled with git-lfs.

git lfs install
git lfs pull

I think NumPy should report this case more accurately.


I had exactly the same issue. The dtype in my .npz file was uint8, not object, so technically allow_pickle should not have been required. My NumPy version is 1.20.x.

Got the following when using allow_pickle=False

ValueError: Cannot load file containing pickled data when allow_pickle=False

And with allow_pickle=True I got

OSError: Failed to interpret file 'finaldf_p_85_12.npy' as a pickle
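
One way to confirm this situation is to look at the first bytes of the file. When git-lfs has not pulled a file, what sits on disk is a small text pointer whose first line starts with "version https://git-lfs.github.com/spec/v1". The sketch below checks for that prefix; the helper name is_git_lfs_pointer and the demo file are made up for illustration.

```python
import os
import tempfile

# Git LFS pointer files are small text files whose first line starts with this prefix.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_git_lfs_pointer(path):
    """Return True if the file looks like an un-pulled Git LFS pointer (hypothetical helper)."""
    with open(path, "rb") as f:  # read-only, so the file is never modified
        return f.read(len(LFS_PREFIX)) == LFS_PREFIX


# Demo on throwaway files: one mimics a pointer, one mimics the start of a real .npy file.
demo_dir = tempfile.mkdtemp()
pointer_path = os.path.join(demo_dir, "pointer.npy")
binary_path = os.path.join(demo_dir, "binary.npy")

with open(pointer_path, "wb") as f:
    f.write(LFS_PREFIX + b"\noid sha256:0000\nsize 12345\n")
with open(binary_path, "wb") as f:
    f.write(b"\x93NUMPY" + b"\x00" * 32)

print(is_git_lfs_pointer(pointer_path))
print(is_git_lfs_pointer(binary_path))
```

If the check is positive, running git lfs pull as shown above should replace the pointer with the real data.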

Ingold answered 11/5, 2021 at 3:14 Comment(2)
The same thing happened to me. The .npy file was corrupted; I had to remake the repository and re-upload the git-lfs files (.npy and .kv).Pinole
Could somebody explain this more? I have been given a .npz file by someone else. I'm not sure what to do exactly. I don't understand how git has anything to do with this. Thanks!Carousel
M
9

I used the syntax below to load the .npy file and it worked.

np.load("finaldf_p_85_12.npy",allow_pickle=True)

I think you need to add allow_pickle=True parameter.

Mammon answered 26/5, 2020 at 9:32 Comment(2)
She already stated in her question that this raises an OSError in her case.Lutero
Setting allow_pickle=True isn't necessarily going to work. In my case, it raised a pickle.UnpicklingError saying it could not interpret the file.Ellord
L
6

The existing answers are all useful. Just a note that I just got this error from someone else's pickle file when the file itself was corrupted. As above, allow_pickle=False complained about pickle being disabled, and allow_pickle=True complained about it not being a valid pickle file. The fix was just redownloading the file in my case.

Lothaire answered 22/4, 2021 at 1:41 Comment(1)
Indeed the best answer applying to my caseDevotee
A
3

I was stuck on this problem for a long time. I tried all of the solutions listed here, but none of them worked. I also tried different versions of Python (2.7, 3.7, 3.9) and the result was the same.

Finally I noticed that the .npy file itself was corrupted, which is why it raised this error. Here is the line that raised it:

npyFile = np.load('file1.npy')

So if you come across the same problem, the first thing to do is check the .npy file itself.
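
A rough way to check a file without loading it is to look at its size and first bytes: a .npy file starts with the six bytes \x93NUMPY, and a .npz file is a zip archive. The sketch below relies only on that; describe_file is a hypothetical helper, and it inspects the header only, so a file that was cut off after the header would still be reported as "npy".

```python
import os
import tempfile
import zipfile

import numpy as np

NPY_MAGIC = b"\x93NUMPY"  # every .npy file starts with these six bytes


def describe_file(path):
    """Classify a file by size and header without unpickling anything (hypothetical helper)."""
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as f:  # read-only on purpose
        head = f.read(len(NPY_MAGIC))
    if head == NPY_MAGIC:
        return "npy"
    if zipfile.is_zipfile(path):  # .npz files are zip archives
        return "npz"
    return "unknown"


# Demo files: a valid .npy, a valid .npz, an empty file, and a text file with a .npy name.
demo_dir = tempfile.mkdtemp()
good_npy = os.path.join(demo_dir, "good.npy")
good_npz = os.path.join(demo_dir, "good.npz")
empty = os.path.join(demo_dir, "empty.npy")
text = os.path.join(demo_dir, "text.npy")

np.save(good_npy, np.zeros(3))
np.savez(good_npz, a=np.zeros(3))
open(empty, "wb").close()
with open(text, "w") as f:
    f.write("0.1 0.2 0.3\n")

for p in (good_npy, good_npz, empty, text):
    print(os.path.basename(p), "->", describe_file(p))
```

Anything reported as "empty" or "unknown" is not something np.load can read as an array, whatever the value of allow_pickle.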

Azine answered 29/9, 2022 at 10:37 Comment(2)
How do you check the .npy file if you can't load it?Curly
It can be opened with a text editor such as gedit.Azine
P
2

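Python has a built-in serialization module called pickle. NumPy falls back to pickle for arrays it cannot store as raw data, that is, arrays with dtype=object (for example, a ragged list of lists), and the NumPy documentation warns about loading pickled data: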

Warning: Loading files that contain object arrays uses the pickle module, which is not secure against erroneous or maliciously constructed data. Consider passing allow_pickle=False to load data that is known not to contain object arrays for the safer handling of untrusted sources.

You might be saving an array that consists of a single DataFrame. This can cause pickling. Example:

x =  array([[ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1]])

In that case, try saving just the numpy array as np.save(filename, x[0]). This will not use any pickling to save your data and resolves the issue.
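
To see when pickle actually gets involved, here is a small sketch (the array contents and file names are invented for the demo). A plain numeric array round-trips with the default allow_pickle=False, while an array with dtype=object can only be loaded with allow_pickle=True.

```python
import os
import tempfile

import numpy as np

tmp = tempfile.mkdtemp()

# A plain numeric array is stored as a header plus raw bytes, so no pickle is involved.
numeric = np.full((7, 3), 0.1)
numeric_path = os.path.join(tmp, "numeric.npy")
np.save(numeric_path, numeric)
loaded = np.load(numeric_path)  # allow_pickle is left at its default (False)

# An object array (here: rows of different lengths) can only be stored via pickle.
ragged = np.empty(2, dtype=object)
ragged[0] = [1, 2, 3]
ragged[1] = [4, 5]
ragged_path = os.path.join(tmp, "ragged.npy")
np.save(ragged_path, ragged)

try:
    np.load(ragged_path)
    needs_pickle = False
except ValueError:
    needs_pickle = True  # object arrays are refused unless allow_pickle=True

restored = np.load(ragged_path, allow_pickle=True)
print(loaded.dtype, needs_pickle, restored.dtype)
```

Checking the dtype before saving (for example with np.asarray(x).dtype) is a quick way to tell which of the two cases you are in.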

Paramount answered 7/4, 2020 at 0:4 Comment(0)
B
1

The OSError suggests you could be having a Python 2/Python 3 issue. I had the same problem and the same errors when I tried to read, with Python 3, a file that had been written with Python 2. For me, calling np.load with the following arguments worked:

np.load('file.npy',allow_pickle=True,fix_imports=True,encoding='latin1')

The doc for numpy.load says about the encoding argument, "Only useful when loading Python 2 generated pickled files in Python 3, which includes npy/npz files containing object arrays."

Borlase answered 22/12, 2020 at 19:16 Comment(0)
C
1

My hypothesis is that while I was loading the .npz file, some other task was still writing to it; in that case np.load(file_path, mmap_mode='r') raises 'Cannot load file containing pickled data when allow_pickle=False'.

So reading it after some time fixed it.
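
If the writer is under your control, one common way to avoid reading a half-written file is to write to a temporary file in the same directory and then rename it over the target. This is only a sketch under that assumption; save_atomically is a hypothetical helper, not part of NumPy, and it is shown with np.save, although np.savez also accepts a file object.

```python
import os
import tempfile

import numpy as np


def save_atomically(path, array):
    """Write `array` so that readers see either the old or the new file (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            np.save(f, array)  # a file object keeps NumPy from appending ".npy" to the name
        os.replace(tmp_path, path)  # rename within one directory swaps the file in one step
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file if anything went wrong
        raise


out_dir = tempfile.mkdtemp()
target = os.path.join(out_dir, "data.npy")
save_atomically(target, np.arange(5))
print(np.load(target))
```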

Chronicles answered 19/1, 2023 at 11:30 Comment(0)
Y
1

In my case the file was corrupted. However, it happened in an unexpected way, which is why I am posting it here.

TL;DR: opening the file with the wrong access mode before calling np.load can already corrupt it (mode 'wb' truncates the file as soon as it is opened), so you need to export it again.


I started to save my file as usual:

with open('data.npy', 'wb') as f:
    np.save(f, data)

In a different script I copy pasted the code and modified it to call load instead of save:

with open('data.npy', 'wb') as f: # <-- wrong access mode 
    data = np.load(f)

Note that I forgot to change the access mode! I got an UnsupportedOperation error, corrected the access mode, and didn't think much of it at first. The next attempt to load the file failed with ValueError: Cannot load file containing pickled data when allow_pickle=False, even when calling np.load(f, allow_pickle=True). Everything worked after saving the file again; this time np.load(f) with the correct access mode worked even without allow_pickle=True.

I didn't expect that this single np.load call with the wrong access mode would already corrupt my file. Hope this helps someone!
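
The damage is done by open() rather than by np.load itself: mode 'wb' empties the file the moment it is opened. A minimal sketch that shows this with file sizes (the file name and array are invented for the demo):

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "data.npy")
np.save(path, np.arange(10))
size_before = os.path.getsize(path)

# Opening with "wb" truncates the file immediately, before any read is attempted.
with open(path, "wb") as f:
    pass
size_after = os.path.getsize(path)

print(size_before, size_after)

# Re-export, then load by passing the path so that no access mode can be mistyped.
np.save(path, np.arange(10))
data = np.load(path)
```

Passing the file name straight to np.load (or opening with 'rb') never modifies the file.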

Yearn answered 10/2, 2023 at 8:7 Comment(2)
Everyone here is saying that the file might be corrupted, but no one explains why. In my case the data was very simple, so I couldn't figure out why it was corrupted. My issue was exactly the same as in this answer: opening the .npy file in write mode instead of read mode, which corrupted the file.Curly
This is a bit scary that a single typo could corrupt data in a potentially irreversible way...Curly
T
0

I had the same issue. Try np.loadtxt instead.

Thrasonical answered 7/6, 2022 at 9:4 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Leipzig
Y
0

I uploaded my files to Google Drive and then loaded them from the mounted drive. That solved it.

import numpy as np
from google.colab import drive

# Mount Google Drive in Colab, then load the file from the mounted path.
drive.mount("/content/drive")
label = np.load("path/labels.npy")

Yandell answered 16/11, 2022 at 14:39 Comment(0)
P
0

In my case (and yes, this was a rather silly issue) I needed np.loadtxt, not np.load, because my file was a plain text file.

https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html

Pictorial answered 21/11, 2023 at 23:26 Comment(0)
M
0

In my case, I ran into the same problem described in the question. I found that this error occurs if the .npy file is empty (0 KB), so it may be worth checking the file size first.

Mosqueda answered 1/3 at 14:44 Comment(0)
O
0

A possible source of the problem is that the file is not actually in .npy format (for example, it is plain text written by numpy.savetxt() instead of numpy.save()), or that the file is broken in some other way.
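
A small sketch of the text-file case, using an invented file name with a deliberately misleading .npy extension: np.load rejects the file with the "pickled data" ValueError, while np.loadtxt reads it fine.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "matrix.npy")  # misleading extension on purpose
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
np.savetxt(path, matrix)  # writes plain text, not the binary .npy format

try:
    np.load(path)
    load_error = None
except ValueError as exc:
    load_error = str(exc)  # the same message as in the question

from_text = np.loadtxt(path)
print(load_error)
print(from_text)
```

The extension does not matter to np.load; only the bytes at the start of the file do.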

Overthrust answered 16/5 at 9:38 Comment(0)
W
-2

Just make sure the file isn't corrupted.

Wonderment answered 7/12, 2021 at 8:28 Comment(2)
Please give more information, preferably an example and a code section.Vasiliki
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Leipzig
