Datasets in the '.t7' format are Lua tables of labeled Tensors.
For example, the following Lua code:
require 'torch'
require 'paths'
-- Download and unpack the archive only if it is not already present
if not paths.filep("cifar10torchsmall.zip") then
    os.execute('wget -c https://s3.amazonaws.com/torch7/data/cifar10torchsmall.zip')
    os.execute('unzip cifar10torchsmall.zip')
end
Readed_t7 = torch.load('cifar10-train.t7')
print(Readed_t7)
In iTorch, this prints:
{
data : ByteTensor - size: 10000x3x32x32
label : ByteTensor - size: 10000
}
This means the file contains a table with two ByteTensors, one stored under the key "data" and the other under the key "label".
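As a rough illustration of that layout, the sketch below builds a small stand-in table with the same two keys and shows how its fields might be accessed. The stand-in holds only 4 examples instead of 10000 (an assumption made here to avoid the download), and the names dataset, first_image and first_label are illustrative; the table returned by torch.load should behave the same way.

```lua
require 'torch'

-- Stand-in with the same layout as the loaded file (4 examples only);
-- replace it with torch.load('cifar10-train.t7') to inspect the real data.
local dataset = {
    data  = torch.ByteTensor(4, 3, 32, 32):zero(),
    label = torch.ByteTensor(4):fill(1),
}

-- Fields of the table are reached by name.
print(dataset.data:size())   -- dimensions of the image tensor
print(dataset.label:size())  -- one label per image

-- Indexing the first dimension selects a single example.
local first_image = dataset.data[1]   -- a 3x32x32 ByteTensor
local first_label = dataset.label[1]  -- a plain Lua number
print(torch.type(first_image), first_label)
```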
To answer your question: first read your images (for example with torchx: https://github.com/nicholas-leonard/torchx/blob/master/README.md ), then put them in a table together with a Tensor of labels. The following code is only a draft to help you get started. It assumes that there are two classes, that all your images are in the same folder, and that they are ordered by class.
require 'torchx'
require 'image'  -- provides image.load
-- Read the whole dataset (the chosen extension is png)
files = paths.indexdir("/Path/to/your/images/", 'png', true)
data1 = {}
for i = 1, files:size() do
    -- Load each image with 3 channels
    local img1 = image.load(files:filename(i), 3)
    table.insert(data1, img1)
end
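-- Create the table of labels (the images are assumed to be ordered by class)
-- number_of_images_of_the_first_class must be set beforehand
label1 = {}
for i = 1, #data1 do
    if i <= number_of_images_of_the_first_class then
        label1[i] = 1
    else
        label1[i] = 2
    end
end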
-- Convert the tables to Tensors
label = torch.Tensor(label1)
-- The size below assumes that every image is 3x16x16; adjust it to your images
data = torch.Tensor(#data1, 3, 16, 16)
for i = 1, #data1 do
    data[i] = data1[i]
end
-- Create the table to save
Data_to_Write = { data = data, label = label }
-- Save the table in /tmp
torch.save("/tmp/Saved_Data.t7", Data_to_Write)
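Once a table has been saved this way, loading it back is a quick way to check that the layout survived. The sketch below is only an illustration: it uses a tiny stand-in table (4 images of size 3x16x16) and a temporary file from os.tmpname() rather than the real data and /tmp/Saved_Data.t7, and the names to_write, path and loaded are made up for this example.

```lua
require 'torch'

-- Tiny stand-in for the table built above (4 images of size 3x16x16).
local to_write = {
    data  = torch.Tensor(4, 3, 16, 16):zero(),
    label = torch.Tensor({1, 1, 2, 2}),
}
local path = os.tmpname()  -- temporary file used only for this check
torch.save(path, to_write)

-- Load the file back and confirm that there is one label per image.
local loaded = torch.load(path)
print(loaded.data:size())
print(loaded.label:size())
assert(loaded.data:size(1) == loaded.label:size(1), "one label per image expected")
os.remove(path)
```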
The conversion code could certainly be written more elegantly, but this version spells out every step and works with Torch7 and Jupyter 5.0.0.
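One small step toward a more general version concerns the labels: if the number of images per class is known, the two-class loop extends to any number of classes. This is a minimal sketch in plain Lua, still assuming the images are ordered by class; build_labels and class_counts are hypothetical names, not part of torch or torchx.

```lua
-- Hypothetical helper: class_counts[k] is the number of images in class k,
-- with the images assumed to be ordered class by class.
local function build_labels(class_counts)
    local labels = {}
    for class_index, count in ipairs(class_counts) do
        for _ = 1, count do
            labels[#labels + 1] = class_index
        end
    end
    return labels
end

-- Example: 2 images of class 1, 3 of class 2, 1 of class 3.
local labels = build_labels({2, 3, 1})
print(table.concat(labels, " "))  -- 1 1 2 2 2 3
```

The resulting table can then be passed to torch.Tensor in the same way as label1.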
Hope it helps.
Regards
The .7z extension corresponds to a file compressed with 7-Zip (7-zip.org). If people are supplying you with data in this format, you need to decompress it before using it; I highly doubt torch reads .7z files directly. – Wink