How to create simple 3-layer neural network and teach it using supervised learning?
Based on PyBrain's tutorials I managed to knock together the following code:

#!/usr/bin/env python2
# coding: utf-8

from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

# Build a 2-3-1 feed-forward network: linear input, sigmoid hidden, linear output.
n = FeedForwardNetwork()

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)

n.sortModules()  # finalize the network topology

# XOR truth table as the training set.
ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

trainer = BackpropTrainer(n, ds)
# trainer.train()
trainer.trainUntilConvergence()

print n.activate([0, 0])[0]
print n.activate([0, 1])[0]
print n.activate([1, 0])[0]
print n.activate([1, 1])[0]

It's supposed to learn the XOR function, but the results seem quite random:

0.208884929522
0.168926515771
0.459452834043
0.424209192223

or

0.84956138664
0.888512762786
0.564964077401
0.611111147862

Interference answered 18/9, 2015 at 15:19

There are four problems with your approach, all easy to identify after reading the Neural Network FAQ:

  • Why use a bias/threshold?: you should add a bias node. Without a bias, learning is very limited: the separating hyperplane represented by the network can only pass through the origin. With a bias node, it can move freely and fit the data better:

    from pybrain.structure import BiasUnit

    bias = BiasUnit()
    n.addModule(bias)

    bias_to_hidden = FullConnection(bias, hiddenLayer)
    n.addConnection(bias_to_hidden)
    
  • Why not code binary inputs as 0 and 1?: all your samples lie in a single quadrant of the sample space. Move them so they are scattered around the origin:

    ds = SupervisedDataSet(2, 1)
    ds.addSample((-1, -1), (0,))
    ds.addSample((-1, 1), (1,))
    ds.addSample((1, -1), (1,))
    ds.addSample((1, 1), (0,))
    

    (Fix the validation code at the end of your script accordingly: the n.activate(...) calls should use the new -1/1 inputs too.)

  • The trainUntilConvergence method splits off part of the dataset for validation and does something that resembles early stopping. This doesn't make sense for such a small dataset. Use trainEpochs instead; 1000 epochs is more than enough for the network to learn this problem:

    trainer.trainEpochs(1000)
    
  • What learning rate should be used for backprop?: Tune the learning rate parameter. This is something you do every time you employ a neural network. In this case, the value 0.1 or even 0.2 dramatically increases the learning speed:

    trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)
    

    (Note the verbose=True parameter. Observing how the error behaves is essential when tuning parameters.)

With these fixes I get consistent and correct results for the given network on the given dataset, with an error of less than 1e-23.
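For reference, here is a consolidated sketch of the full script with all four fixes applied (it is just the snippets above merged back into your original code):

#!/usr/bin/env python2
# coding: utf-8

from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, BiasUnit, FullConnection
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

n = FeedForwardNetwork()

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)
bias = BiasUnit()  # fix 1: bias node

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addModule(bias)
n.addOutputModule(outLayer)

n.addConnection(FullConnection(inLayer, hiddenLayer))
n.addConnection(FullConnection(bias, hiddenLayer))
n.addConnection(FullConnection(hiddenLayer, outLayer))

n.sortModules()

# Fix 2: inputs recoded as -1/1, scattered around the origin.
ds = SupervisedDataSet(2, 1)
ds.addSample((-1, -1), (0,))
ds.addSample((-1, 1), (1,))
ds.addSample((1, -1), (1,))
ds.addSample((1, 1), (0,))

# Fixes 3 and 4: tuned learning rate, fixed number of epochs.
trainer = BackpropTrainer(n, dataset=ds, learningrate=0.1, verbose=True)
trainer.trainEpochs(1000)

print n.activate([-1, -1])[0]
print n.activate([-1, 1])[0]
print n.activate([1, -1])[0]
print n.activate([1, 1])[0]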

Bobby answered 20/9, 2015 at 19:26
Is there any method to train the network until the mean error is lower than (or equal to) a required value, or a limit on the number of epochs is reached? – Interference
@Interference PyBrain's documentation honestly states that "This documentation comprises just a subjective excerpt of available methods". So you'll need to look into the implementation of your particular PyBrain distribution. But even if there isn't one, it's very easy to implement such a loop yourself. – Bobby
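A minimal sketch of such a loop, relying on BackpropTrainer.train() running a single epoch and returning the average error (the threshold and epoch limit below are hypothetical placeholders, not PyBrain defaults):

max_epochs = 10000    # hypothetical upper limit on the number of epochs
target_error = 1e-10  # hypothetical required mean error

for epoch in xrange(max_epochs):
    error = trainer.train()  # one epoch over the dataset; returns the average error
    if error <= target_error:
        break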
