I am trying to solve this Kaggle problem using neural networks, with the PyBrain Python library.
It's a classic supervised learning problem. In the following code, the 'data' variable is a numpy array of shape (892, 8): column 0 holds the output value, which can be '0' or '1', and the remaining 7 columns are my features.
from pybrain.datasets import ClassificationDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import SigmoidLayer, TanhLayer  # layer classes used below

dataset = ClassificationDataSet(7, 1)
for i in data:
    dataset.appendLinked(i[1:], i[0])  # columns 1..7 are features, column 0 is the target

net = buildNetwork(7, 9, 7, 1, bias=True, hiddenclass=SigmoidLayer, outclass=TanhLayer)
trainer = BackpropTrainer(net, learningrate=0.04, momentum=0.96, weightdecay=0.02, verbose=True)
trainer.trainOnDataset(dataset, 8000)  # train for 8000 epochs
trainer.testOnData(verbose=True)
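A minimal sketch of an alternative training loop that watches the per-epoch error, so a flat or diverging error curve shows up early; trainer.train() runs one epoch and returns its average error, assuming the dataset is passed at construction:

trainer = BackpropTrainer(net, dataset, learningrate=0.04, momentum=0.96, weightdecay=0.02)
for epoch in range(8000):
    err = trainer.train()  # one epoch of backprop; returns the average error
    if epoch % 500 == 0:
        print(epoch, err)  # a flat curve here means the net is not learning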
After training my neural network, when I test it on the training data, it always gives a single output for all inputs, like:
Testing on data:
out: [ 0.075]
correct: [ 1.000]
error: 0.42767858
out: [ 0.075]
correct: [ 0.000]
error: 0.00283875
out: [ 0.075]
correct: [ 1.000]
error: 0.42744569
out: [ 0.077]
correct: [ 1.000]
error: 0.42616996
out: [ 0.076]
correct: [ 0.000]
error: 0.00291185
out: [ 0.076]
correct: [ 1.000]
error: 0.42664586
out: [ 0.075]
correct: [ 1.000]
error: 0.42800026
out: [ 0.076]
correct: [ 1.000]
error: 0.42719380
out: [ 0.076]
correct: [ 0.000]
error: 0.00286796
out: [ 0.076]
correct: [ 0.000]
error: 0.00286642
out: [ 0.076]
correct: [ 1.000]
error: 0.42696969
out: [ 0.076]
correct: [ 0.000]
error: 0.00292401
out: [ 0.074]
correct: [ 0.000]
error: 0.00274975
out: [ 0.076]
correct: [ 0.000]
error: 0.00286129
I have tried altering the learning rate, weight decay, momentum, the number of hidden units, the number of hidden layers, and the classes of the hidden and output layers to resolve it, but in every case it gives the same output for every input from the training data.
I think I should run it for more than 8000 epochs, because when I built a neural network for 'XOR', it took at least 700 iterations before the error dropped to the nano scale (of the order of 1e-9). But the training set for 'XOR' had only 4 samples, whereas here it has 892. So I ran 8000 iterations on 10% of the original data (training-data size is now 89), and even then it gave the same output for every input in the training data. And since I want to classify inputs into '0' or '1', if I use Softmax as the output-layer class, it always gives '1' as output; presumably because Softmax normalizes its outputs to sum to 1, so a single output unit can only ever emit '1'.
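A minimal sketch of the usual PyBrain workaround for that: declare the two classes up front and expand the single 0/1 target into a one-hot pair, so the Softmax layer has two units to normalize over (same 'data' array as above):

from pybrain.datasets import ClassificationDataSet
from pybrain.structure import SoftmaxLayer
from pybrain.tools.shortcuts import buildNetwork

dataset = ClassificationDataSet(7, 1, nb_classes=2)
for i in data:
    dataset.appendLinked(i[1:], i[0])
dataset._convertToOneOfMany()  # targets become [1, 0] / [0, 1]

net = buildNetwork(7, 9, 2, bias=True, outclass=SoftmaxLayer)
# after training, the predicted class is net.activate(inp).argmax()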
No matter which configuration (number of hidden units, class of output layer, learning rate, class of hidden layer, momentum) I used for 'XOR', it more or less converged in every case.
Is it possible that there is some configuration that will finally yield lower error rates? Or at least some configuration so that it won't give the same output for all inputs in the training data?
I ran it for 80,000 iterations (training-data size is 89). Output sample:
Testing on data:
out: [ 0.340]
correct: [ 0.000]
error: 0.05772102
out: [ 0.399]
correct: [ 0.000]
error: 0.07954010
out: [ 0.478]
correct: [ 1.000]
error: 0.13600274
out: [ 0.347]
correct: [ 0.000]
error: 0.06013008
out: [ 0.500]
correct: [ 0.000]
error: 0.12497886
out: [ 0.468]
correct: [ 1.000]
error: 0.14177601
out: [ 0.377]
correct: [ 0.000]
error: 0.07112816
out: [ 0.349]
correct: [ 0.000]
error: 0.06100758
out: [ 0.380]
correct: [ 1.000]
error: 0.19237095
out: [ 0.362]
correct: [ 0.000]
error: 0.06557341
out: [ 0.335]
correct: [ 0.000]
error: 0.05607577
out: [ 0.381]
correct: [ 0.000]
error: 0.07247926
out: [ 0.355]
correct: [ 1.000]
error: 0.20832669
out: [ 0.382]
correct: [ 1.000]
error: 0.19116165
out: [ 0.440]
correct: [ 0.000]
error: 0.09663233
out: [ 0.336]
correct: [ 0.000]
error: 0.05632861
Average error: 0.112558819082
('Max error:', 0.21803000849096299, 'Median error:', 0.096632332865968451)
It's giving all outputs within the range (0.33, 0.5).
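Even with outputs stuck in that band, a hard 0/1 prediction can be read off by thresholding at 0.5, which at least makes the classification accuracy measurable. A minimal sketch, assuming 'net' and 'dataset' as built above:

correct = 0
for inp, target in zip(dataset['input'], dataset['target']):
    pred = 1 if net.activate(inp)[0] > 0.5 else 0  # threshold the raw activation
    correct += (pred == int(target[0]))
print('accuracy: %.3f' % (float(correct) / len(dataset)))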
A commenter suggested decaying the learning rate with the iteration number, e.g. LR * (N - i)/N, and asked: "What are the bounds? I suppose they are different for different features? Then normalize them to be within the same range. Use either Sigmoid or Tanh layers in the whole net. If you use Sigmoid, leave 0 and 1; if you use Tanh, use [-1, +1] as outputs." – Ludovika
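A minimal sketch of both suggestions, assuming 'data' has the target in column 0 as above; PyBrain's built-in lrdecay (a multiplicative per-step decay) stands in for the linear LR * (N - i)/N schedule from the comment:

import numpy as np
from pybrain.supervised.trainers import BackpropTrainer

# 1. Min-max scale every feature column to [0, 1] for Sigmoid outputs
#    (use 2 * scaled - 1 for a Tanh output layer expecting [-1, +1]).
features = data[:, 1:].astype(float)
lo, hi = features.min(axis=0), features.max(axis=0)
scaled = (features - lo) / np.where(hi > lo, hi - lo, 1.0)  # guard constant columns

# 2. Let the learning rate shrink as training progresses.
trainer = BackpropTrainer(net, dataset, learningrate=0.04,
                          lrdecay=0.9999,  # alpha is multiplied by this each step
                          momentum=0.96, verbose=True)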