After reading many links about building Caffe layers in Python, I still have difficulties understanding a few concepts. Can someone please clarify them?
- Blobs and weights python structure for network is explained here: Finding gradient of a Caffe conv-filter with regards to input.
- Network and Solver structure is explained here: Cheat sheet for caffe / pycaffe?.
- Example of defining python layer is here: pyloss.py on git.
- Layer tests here: test layer on git.
- Development of new layers for C++ is described here: git wiki.
What I am still missing is:
1. `setup()` method: what should I do here? Why, in the example, should I compare the length of the `bottom` parameter with 2? Why should it be 2? It doesn't seem to be the batch size, because that is arbitrary. And `bottom`, as I understand it, is a blob, so its first dimension is the batch size?

2. `reshape()` method: as I understand it, the `bottom` input parameter is the blob of the layer below, and the `top` parameter is the blob of the layer above, and I need to reshape the top blob according to the output shape of my forward-pass calculations. But why do I need to do this on every forward pass, if these shapes do not change from pass to pass and only the weights change?

3. The `reshape` and `forward` methods use index 0 on the `top` input parameter. Why would I need to write `top[0].data = ...` or `top[0].input = ...` instead of `top.data = ...` and `top.input = ...`? What is this index about? If we do not use the other elements of this `top` list, why is it exposed this way? I suspect it is a coincidence of the C++ backbone, but it would be good to know exactly.

4. `reshape()` method, the line `if bottom[0].count != bottom[1].count`: what do I do here? Why is its dimension 2 again? And what am I counting here? Why should both parts of the blobs (0 and 1) be equal in their number of members (`count`)?

5. `forward()` method, what do I define with the line `self.diff[...] = bottom[0].data - bottom[1].data`? Where is it used after the forward pass, if I define it? Can we just use `diff = bottom[0].data - bottom[1].data` instead, to compute the loss later in this method without assigning to `self`, or is it done with some purpose?

6. `backward()` method: what is `for i in range(2):` about? Why is the range 2 again?

7. `backward()` method, the `propagate_down` parameter: why is it defined? I mean, if it is `True`, the gradient should be assigned to `bottom[X].diff` as I see it, but why would someone call a method that does nothing with `propagate_down = False`, if it just does nothing and still cycles inside?
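To make the questions concrete, here is a pure-NumPy paraphrase of the `EuclideanLossLayer` from pyloss.py as I read it. `FakeBlob` is my own stand-in for a Caffe blob (the `data`, `diff`, `count`, and `num` attributes mimic what the example uses, but the class itself is invented for illustration), so this sketch runs without Caffe:

```python
import numpy as np

class FakeBlob:
    """Hypothetical stand-in for a Caffe blob: data/diff arrays plus
    count (total number of elements) and num (batch size, first axis)."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)
        self.diff = np.zeros_like(self.data)

    @property
    def count(self):
        return self.data.size

    @property
    def num(self):
        return self.data.shape[0]

class EuclideanLossMock:
    """Pure-NumPy paraphrase of the layer logic the questions refer to."""
    def setup(self, bottom, top):
        # the len(bottom) != 2 check: this loss needs exactly two input
        # blobs (predictions and labels), regardless of batch size
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")

    def reshape(self, bottom, top):
        # count compares total element counts: both inputs must match
        # element-for-element for the elementwise difference below
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        self.diff = np.zeros_like(bottom[0].data)

    def forward(self, bottom, top):
        # self.diff is cached on the layer object so backward() can reuse it
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.0

    def backward(self, top, propagate_down, bottom):
        # range(2): one gradient per bottom blob; propagate_down[i] lets
        # the caller skip blobs that need no gradient (e.g. labels)
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num

# usage: batch of 2 predictions vs. labels, gradient only for predictions
pred = FakeBlob([[1.0, 2.0], [3.0, 4.0]])
label = FakeBlob([[1.0, 0.0], [0.0, 4.0]])
loss = FakeBlob(np.zeros(1))
layer = EuclideanLossMock()
layer.setup([pred, label], [loss])
layer.reshape([pred, label], [loss])
layer.forward([pred, label], [loss])
layer.backward([loss], [True, False], [pred, label])
```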
I'm sorry if these questions are too obvious; I just wasn't able to find a good guide to understand them, so I'm asking for help here.