Implementing sparse connections in neural network
Some use cases for neural networks require that not all neurons be connected between two consecutive layers. For my neural network architecture, I need a layer in which each neuron has connections only to some prespecified neurons in the previous layer (at somewhat arbitrary places, not following a pattern such as in a convolution layer). This is needed to model data on a specific graph. I need to implement this "sparse" layer in Theano, but I'm not used to the Theano way of programming.

It seems that the most efficient way of programming sparse connections in Theano would be to use theano.tensor.nnet.blocksparse.SparseBlockGemv. An alternative would be matrix multiplication where many weights are set to 0 (= no connection), but that would be very inefficient compared to SparseBlockGemv, since each neuron is connected to only 2-6 of the ~100000 neurons in the previous layer. Moreover, a 100000x100000 weight matrix would not fit in my RAM or on my GPU. Could someone therefore provide an example of how to implement sparse connections using the SparseBlockGemv method or another computationally efficient method?
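As a non-Theano aside, the memory argument can be sketched with scipy.sparse: storing only the existing connections makes both storage and the matrix-vector product proportional to the number of edges rather than the square of the layer size. The sizes and connectivity below are scaled down and made up for illustration:

```python
# Sketch (NumPy/SciPy, not Theano): why the dense-matrix route is infeasible
# and how a sparse format stores only the real connections.
import numpy as np
from scipy import sparse

n = 1000   # stand-in for the ~100000 neurons in the question
k = 4      # each neuron connects to only a handful of previous-layer neurons

rng = np.random.default_rng(0)
rows = np.repeat(np.arange(n), k)          # index of the output neuron
# k distinct prespecified input neurons per output neuron
cols = np.concatenate([rng.choice(n, size=k, replace=False) for _ in range(n)])
vals = rng.standard_normal(n * k)          # one weight per connection

# CSR stores the n*k nonzero weights instead of n*n dense entries
W = sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))
x = rng.standard_normal(n)
y = W.dot(x)   # sparse matrix-vector product: O(n*k) work, not O(n^2)

print(W.nnz, y.shape)   # 4000 stored weights vs 1,000,000 dense entries
```

This does not answer the Theano requirement, but it shows the scale of the saving: 4000 stored weights instead of a million.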

A perfect example would be to extend the MLP Theano Tutorial with an extra layer after the hidden layer (and before softmax), where each neuron only has connections to a subset of neurons in the previous layer. However, other examples are also very welcome!

Edit: Note that the layer must be implemented in Theano as it is just a small part of a larger architecture.

Suborn answered 5/4, 2016 at 22:11 Comment(1)
I've come to realise that SparseBlockGemv is not meant for general sparse block matrices (such as BSR), but for a dot operation over a large W matrix with only a limited set of input/output block combinations. – Theroid
The output of a fully connected layer is given by the dot product of the input and the weights of that layer. In Theano or NumPy you can use the dot method.

y = x.dot(w)

If you only have connections to some neurons in the previous layer and those connections are predefined you could do something like this:

y = [x[edges[i]].dot(w[i]) for i in neurons]

Here edges[i] contains the indices of the neurons connected to neuron i, and w[i] the weights of those connections.

Please note that Theano doesn't know about layers or other high-level details.
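The one-liner above can be made concrete with plain NumPy (the names edges, w, and neurons match the snippet; the sizes and values below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev = 8                                  # neurons in the previous layer
x = rng.standard_normal(n_prev)             # its activations

# edges[i]: prespecified previous-layer neurons feeding neuron i
edges = [np.array([0, 2]), np.array([1, 3, 5]), np.array([4, 7])]
# w[i]: one weight per incoming connection of neuron i
w = [rng.standard_normal(len(e)) for e in edges]

neurons = range(len(edges))
# gather each neuron's inputs, then a small dot product per neuron
y = np.array([x[edges[i]].dot(w[i]) for i in neurons])

print(y.shape)   # one output per sparsely connected neuron
```

Each neuron only touches the handful of inputs listed in its edges entry, so the cost scales with the number of connections, not the layer size.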

Partisan answered 15/4, 2016 at 8:13 Comment(0)

Apologies for resurrecting an old thread, but this was the simplest guidance I found for extending the example at https://iamtrask.github.io/2015/07/12/basic-python-network/ to partially connected inputs. However, it took me a while to make sense of basaundi's answer, and I think I can improve upon it.

There were a couple of things I needed to change to make it work. In my case, I am trying to map N inputs to M neurons in my first hidden layer. My inputs are in an NxF array, where F is the number of features per input, and my synapse values (weights) between the inputs and the first layer are in an FxM array. Therefore, the output of Inputs <dot> Weights is an NxM array. My edge matrix is an MxF array that specifies, for each neuron in layer 1 (rows), which features of the input data are relevant (columns).

In this setup, at least, I had to slice my arrays differently than specified above. Also, the list comprehension returns a list of matrices that must be summed to get the correct NxM array (otherwise you get an MxNxM array).

So I have used the following (util.sigmoid is a helper function of my own):

# one N x M contribution per neuron, using only its connected features
y = [numpy.dot(x[:, edges[i]], w[edges[i]])
     for i in range(M)]
# sum over the list of matrices, then squash with the sigmoid
y = util.sigmoid(numpy.sum(y, 0))

This seems to work for me.
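For reference, here is a self-contained version of the snippet above with made-up shapes. A plain sigmoid stands in for util.sigmoid, and the MxF edge matrix is kept as a list of index arrays:

```python
import numpy as np

def sigmoid(z):
    # stand-in for util.sigmoid in the answer above
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, F, M = 5, 6, 4                 # samples, input features, layer-1 neurons

x = rng.standard_normal((N, F))   # inputs: N x F
w = rng.standard_normal((F, M))   # synapse weights: F x M
# edges[i]: which input features feed neuron i (list form of the MxF edge matrix)
edges = [rng.choice(F, size=3, replace=False) for _ in range(M)]

# each term is N x M; summing the list keeps the N x M shape
y = [np.dot(x[:, edges[i]], w[edges[i]]) for i in range(M)]
y = sigmoid(np.sum(y, 0))

print(y.shape)
```

With N=5 and M=4 this yields a 5x4 output, matching the NxM shape described above.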

Turgent answered 14/3, 2018 at 17:19 Comment(3)
"took me a while to make sense of this answer [...]" Please include the answer in your "answer," because the blog post could disappear long before this post does. We don't need the whole blog post, just enough to answer the question completely. – Corena
And, welcome to Stack Overflow! – Corena
You misunderstand my comment. "Answer" refers to the message to which I am replying. The link is provided only for context, in case there is ambiguity between my implementation and that of the original poster. – Turgent
