I have a large image classification dataset stored in .hdf5 format. The labels and the images are both stored in the .hdf5 file. I cannot view the images because they are stored as arrays. The code I used to read the dataset is as follows:
import h5py
import numpy
f = h5py.File('data/images.hdf5', 'r')
print(list(f.keys()))
['datasets']
group = f['datasets']
list(group.keys())
['car']
Now, when I read the group car, I get the following output:
data = group['car']
data.shape,data[0].shape,data[1].shape
((51,), (383275,), (257120,))
So it looks like there are 51 images for the label car, and the images are stored as 383275- and 257120-dimensional arrays, with no information about their height and width. I want to save the images as RGB again.
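Neither 383275 nor 257120 is divisible by 3, so these arrays cannot be raw height x width x 3 RGB pixel buffers. One possibility, which is an assumption and not something the file confirms, is that each entry holds the bytes of an encoded image file (for example JPEG or PNG). A minimal sketch for checking this is below; the helper names `guess_encoding` and `could_be_raw_rgb` are hypothetical, and only two common signatures are tested.

```python
import numpy as np

# Magic-byte prefixes of two common encoded image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def guess_encoding(array):
    """Return 'JPEG' or 'PNG' if the 1-D array starts with that signature, else None."""
    # Reinterpret the first few values as raw bytes.
    head = np.asarray(array).astype(np.uint8)[:8].tobytes()
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def could_be_raw_rgb(array):
    """A flat RGB pixel buffer must have a length divisible by 3."""
    return np.asarray(array).size % 3 == 0


# Hypothetical usage with the dataset from the question:
# for i in range(len(data)):
#     print(i, data[i].dtype, guess_encoding(data[i]), could_be_raw_rgb(data[i]))
```

If `guess_encoding` returns None for every entry, the assumption does not hold and the image shapes would have to come from somewhere else, such as the HDF5 attributes or the code that wrote the file.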
Next, following the code here, I tried to read the images.
import numpy as np
from PIL import Image
# hdf = h5py.File("Sample.h5",'r')
array = data[0]
img = Image.fromarray(array.astype('uint8'), 'RGB')
img.save("yourimage.thumbnail", "JPEG")
img.show()
Unfortunately, I get the following error:
File /usr/local/lib/python3.8/dist-packages/PIL/Image.py:784, in Image.frombytes(self, data, decoder_name, *args)
781 s = d.decode(data)
783 if s[0] >= 0:
--> 784 raise ValueError("not enough image data")
785 if s[1] != 0:
786 raise ValueError("cannot decode image data")
ValueError: not enough image data
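For an RGB image, Image.fromarray expects an array already shaped as (height, width, 3), which a flat 1-D array is not. If the entries turn out to be encoded image files rather than raw pixels (an assumption, not verified for this dataset), they could be decoded with Image.open on an in-memory buffer instead. A minimal sketch, where `decode_image` and the output filename are hypothetical:

```python
import io

import numpy as np
from PIL import Image


def decode_image(array):
    """Decode a 1-D array of encoded image bytes (e.g. JPEG/PNG) into an RGB image.

    Assumes the array holds the contents of an image file, not raw pixels.
    """
    raw = np.asarray(array).astype(np.uint8).tobytes()
    img = Image.open(io.BytesIO(raw))  # format is detected from the bytes
    return img.convert("RGB")


# Hypothetical usage with the dataset from the question:
# for i in range(len(data)):
#     decode_image(data[i]).save(f"car_{i}.jpg", "JPEG")
```

Image.open raises an error if the bytes are not a recognizable image file, which would indicate that the assumption is wrong.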
I have already checked references such as the HDF Group help library. Any help would be greatly appreciated. Thanks.