Generating LMDB for Caffe
J

3

9

I am trying to build a deep learning model for saliency analysis using Caffe (through the Python wrapper), but I don't understand how to generate the LMDB data for this purpose. I have gone through the ImageNet and MNIST examples, and I understand that labels should be listed in the format

my_test_dir/picture-foo.jpg 0

But in my case I will be labeling each pixel with 0 or 1 to indicate whether that pixel is salient, so there is no single label per image.

How do I generate LMDB files for per-pixel labeling?

Jaynajayne answered 10/11, 2015 at 10:29 Comment(1)
The current Caffe LMDB interface supports only a single label per image (suitable for classification tasks). However, you can use the HDF5 interface. See this answer.Horseweed
H
10

You can approach this problem in two ways:

1. Use an HDF5 data layer instead of LMDB. HDF5 is more flexible and can hold labels the same size as the image. See this answer for an example of constructing and using an HDF5 input data layer.

2. Use two LMDB input layers: one for the images and one for the labels. Note that when you build the LMDBs you must not use the 'shuffle' option, so that the images and their labels stay in sync.
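
For the first option, a minimal sketch of writing the images and per-pixel masks to HDF5 might look like the following. It assumes h5py and NumPy are installed and that the net's HDF5 data layer has tops named "data" and "label" (the dataset names have to match the top names). The function name, file names and array shapes are hypothetical, chosen only for illustration.

```python
import h5py
import numpy as np


def write_saliency_hdf5(images, masks, h5_path, list_path):
    """Write images (N x C x H x W) and 0/1 masks (N x 1 x H x W) to one HDF5 file.

    Also writes a text file listing the HDF5 file, which is what the
    HDF5 data layer's `source` parameter points at.
    """
    data = np.asarray(images, dtype=np.float32)
    label = np.asarray(masks, dtype=np.float32)
    if data.ndim != 4 or label.ndim != 4:
        raise ValueError("expected 4-D arrays (N x C x H x W)")
    if data.shape[0] != label.shape[0] or data.shape[2:] != label.shape[2:]:
        raise ValueError("images and masks must agree in N, H and W")

    with h5py.File(h5_path, "w") as f:
        # Dataset names are assumed to match the layer's top names.
        f.create_dataset("data", data=data)
        f.create_dataset("label", data=label)

    with open(list_path, "w") as f:
        f.write(h5_path + "\n")
```

Because image i and mask i live at the same index of the same file, they cannot drift out of sync, which is the main practical advantage over two separate databases.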

Update: I recently gave a more detailed answer here.

Horseweed answered 10/11, 2015 at 11:29 Comment(3)
That's really helpful. The HDF5 method is quite clear. However, out of curiosity, how do I go about adding two layers following the second method? I am not able to understand that!Jaynajayne
@Jaynajayne it should be quite easy: create two LMDBs, `train_images_lmdb` and `train_labels_lmdb`, then add two input layers: one with tops "image" and "dummy_label_1", the other with tops "label_image" and "dummy_label_2".Horseweed
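
With two separate LMDBs, the part that has to be right is that the i-th record of each database belongs to the same sample. LMDB iterates keys in sorted byte order, so one way to get this is to give both databases the same zero-padded keys. The sketch below only plans the keys and the pairing; the function name and file names are hypothetical. If some shuffling is wanted, it applies a single permutation to both lists rather than shuffling each database independently.

```python
import random


def paired_entries(image_paths, label_paths, seed=None):
    """Return (key, image_path, label_path) tuples in writing order.

    The same key is meant to be used for the image record and for the
    label record, so both LMDBs iterate in the same order.
    """
    if len(image_paths) != len(label_paths):
        raise ValueError("need exactly one label map per image")

    order = list(range(len(image_paths)))
    if seed is not None:
        # One permutation, applied to images and labels alike.
        random.Random(seed).shuffle(order)

    entries = []
    for new_idx, src_idx in enumerate(order):
        # Zero padding keeps lexicographic key order equal to numeric order.
        key = "{:08d}".format(new_idx)
        entries.append((key, image_paths[src_idx], label_paths[src_idx]))
    return entries
```

Each tuple would then be written once into the image database and once into the label database under the same key.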
Oh! Thanks a lot. Pretty helpful :)Jaynajayne
N
1

Check this one: http://deepdish.io/2015/04/28/creating-lmdb-in-python/

Just load all images in X and corresponding labels in Y.

Neogothic answered 2/12, 2015 at 8:47 Comment(1)
This is not even relevant to my question.Jaynajayne
N
0

In Caffe, both LMDB and HDF5 support multiple labels per image (matrices, if you like); see this thread:

https://github.com/BVLC/caffe/issues/1698#issue-53768814

See this tutorial on how to create a multi-label dataset (lmdb here) for caffe with python code:

http://www.kostyaev.me/article/Multilabel%20Dataset/

EDIT: For example, for the labels it uses the Caffe Python function that converts a 3-dimensional array to a Datum, found in caffe/python/caffe/io.py: array_to_datum(arr, label=None)
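
As a rough sketch of how that function could be used for per-pixel labels: array_to_datum expects a 3-dimensional array, so a 2-D saliency mask first needs a leading channel axis. The helper names below are hypothetical, the map size is an arbitrary assumption, and the writer assumes the lmdb Python package and pycaffe are importable.

```python
import numpy as np


def mask_to_chw(mask):
    """Turn an H x W saliency mask into a 1 x H x W uint8 array of 0s and 1s."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        raise ValueError("expected a 2-D mask")
    return (mask != 0).astype(np.uint8)[np.newaxis, :, :]


def write_label_lmdb(masks, lmdb_path, map_size=1 << 30):
    """Write each mask as a Datum under a zero-padded key."""
    # Imported here so mask_to_chw can be used without lmdb or Caffe installed.
    import lmdb
    import caffe

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, mask in enumerate(masks):
            # No scalar label is passed; the array itself is the label.
            datum = caffe.io.array_to_datum(mask_to_chw(mask))
            key = "{:08d}".format(i).encode("ascii")
            txn.put(key, datum.SerializeToString())
    env.close()
```

The image database would be written the same way, with the same keys, so that the two stay aligned.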

Nasalize answered 1/9, 2016 at 12:50 Comment(3)
Links to potential solutions are always welcome, but please add some details for future visitors in case the link is no longer available.Eel
The second link is dead. Do you have another link to provide? Thanks.Homophony
The link was down for a while and was not in the web archive, but it is now back up and has been archived. Thanks.Nasalize
