How to create simple 3-layer neural network and teach it using supervised learning?
Based on PyBrain's tutorials I managed to knock together the following code:

#!/usr/bin/env python2
# coding: utf-8

from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

# Build a 2-3-1 feed-forward network: linear input, sigmoid hidden, linear output.
n = FeedForwardNetwork()

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)

n.sortModules()  # finalize the network topology

# XOR truth table as the training set.
ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

trainer = BackpropTrainer(n, ds)
# trainer.train()
trainer.trainUntilConvergence()

print n.activate([0, 0])[0]
print n.activate([0, 1])[0]
print n.activate([1, 0])[0]
print n.activate([1, 1])[0]

It's supposed to learn the XOR function, but the results seem quite random:

0.208884929522
0.168926515771
0.459452834043
0.424209192223

or

0.84956138664
0.888512762786
0.564964077401
0.611111147862

Interference answered 18/9, 2015 at 15:19

There are four problems with your approach, all easy to identify after reading the Neural Network FAQ:

  • Why use a bias/threshold?: you should add a bias node. Without a bias, learning is very limited: the separating hyperplane represented by the network can only pass through the origin. With a bias node, it can move freely and fit the data better:

    from pybrain.structure import BiasUnit

    bias = BiasUnit()
    n.addModule(bias)

    bias_to_hidden = FullConnection(bias, hiddenLayer)
    n.addConnection(bias_to_hidden)
    
  • Why not code binary inputs as 0 and 1?: all your samples lie in a single quadrant of the sample space. Move them so they are scattered around the origin:

    ds = SupervisedDataSet(2, 1)
    ds.addSample((-1, -1), (0,))
    ds.addSample((-1, 1), (1,))
    ds.addSample((1, -1), (1,))
    ds.addSample((1, 1), (0,))
    

    (Fix the validation code at the end of your script accordingly: the n.activate(...) calls should use the new -1/1 inputs too.)

  • The trainUntilConvergence method splits off part of the dataset for validation and does something that resembles early stopping. This doesn't make sense for such a small dataset. Use trainEpochs instead; 1000 epochs is more than enough for the network to learn this problem:

    trainer.trainEpochs(1000)
    
  • What learning rate should be used for backprop?: Tune the learning rate parameter. This is something you do every time you employ a neural network. In this case, the value 0.1 or even 0.2 dramatically increases the learning speed:

    trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)
    

    (Note the verbose=True parameter. Observing how the error behaves is essential when tuning parameters.)

With these fixes I get consistent and correct results for the given network on the given dataset, with an error of less than 1e-23.
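For reference, here is a consolidated sketch of the full script with all four fixes applied (it is just the snippets above merged back into your original code):

#!/usr/bin/env python2
# coding: utf-8

from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, BiasUnit, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

n = FeedForwardNetwork()

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)
bias = BiasUnit()  # fix 1: bias node

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addModule(bias)
n.addOutputModule(outLayer)

n.addConnection(FullConnection(inLayer, hiddenLayer))
n.addConnection(FullConnection(bias, hiddenLayer))
n.addConnection(FullConnection(hiddenLayer, outLayer))

n.sortModules()

# Fix 2: inputs recoded as -1/1, scattered around the origin.
ds = SupervisedDataSet(2, 1)
ds.addSample((-1, -1), (0,))
ds.addSample((-1, 1), (1,))
ds.addSample((1, -1), (1,))
ds.addSample((1, 1), (0,))

# Fixes 3 and 4: tuned learning rate, fixed number of epochs.
trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)
trainer.trainEpochs(1000)

print n.activate([-1, -1])[0]
print n.activate([-1, 1])[0]
print n.activate([1, -1])[0]
print n.activate([1, 1])[0]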

Bobby answered 20/9, 2015 at 19:26
Is there any method to train the network until the mean error is lower than (or equal to) a required value, or a limit on the number of epochs is reached? – Interference
@Interference PyBrain's documentation honestly states that "This documentation comprises just a subjective excerpt of available methods". So you'll need to look into the implementation of your particular PyBrain distribution. But even if there isn't one, it's very easy to implement such a loop yourself. – Bobby
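A minimal sketch of such a loop, relying on BackpropTrainer.train() running a single epoch and returning the average error (the threshold and epoch limit below are hypothetical placeholders, not PyBrain defaults):

max_epochs = 10000    # hypothetical upper limit on the number of epochs
target_error = 1e-10  # hypothetical required mean error

for epoch in xrange(max_epochs):
    error = trainer.train()  # one epoch over the dataset; returns the average error
    if error <= target_error:
        break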
