Football prediction program encog: Inconsistent predictions

I am making a program that predicts the outcome of a football match using Encog. I have created a neural network and trained it on data from 90 matches with the resilient propagation training method. I encode the match result as 1 for a home win, 0 for a draw and -1 for an away win.

The problem is in the prediction. Sometimes I get a success rate of 50%, and other times I get as low as 33%. It is like using a random function. What I have noticed is that the most predicted outcome is almost always 1 (around 70%). I have tried changing the number of hidden layers and the number of training iterations, but with no luck; it is still oscillating. Can anyone please help me or point me in the right direction if I am doing something wrong?

Here is the code for the neural network. I am getting the training data and the prediction data from a database.

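Predictor(NeuralDataSet trainingData){
    trainingSet = trainingData;
    network = new BasicNetwork();
    network.addLayer(new BasicLayer(16)); // input layer: 16 match statistics
    network.addLayer(new BasicLayer(3));  // hidden layer: 3 neurons
    network.addLayer(new BasicLayer(1));  // output layer: one value for the result
    network.getStructure().finalizeStructure();
    network.reset(); // randomizes the initial weights, so each run starts differently
}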

Training

public void train(int epoch){
    final Train train = new ResilientPropagation(network, trainingSet);
    // Each iteration is one full pass over the training set; run exactly `epoch` of them.
    for (int i = 0; i < epoch; i++) {
        train.iteration();
    }
}

Predicting

public void successRate(NeuralDataSet trainingData){
    int counter = 0;
    int correct = 0;
    for(MLDataPair pair: trainingData) {
        final MLData output = network.compute(pair.getInput());
        // Round the single output to the nearest label (-1, 0 or 1) before comparing.
        if(pair.getIdeal().getData(0)==Math.round(output.getData(0)))
            correct++;
        counter++;
    }
    // Fraction of matches whose rounded prediction equals the actual result.
    System.out.println((double)correct/(double)counter);
}
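
One way to check whether the network is mostly predicting a single class is to tally the rounded predictions per outcome and compare the accuracy with a baseline that always picks the most frequent result. The sketch below is plain Java with no Encog dependency. The class `OutcomeTally`, its method names and the sample numbers are made up for illustration, and it assumes the -1/0/1 encoding described above.

```java
import java.util.Arrays;

/** Tallies predictions per class and compares accuracy with a majority-class baseline. */
final class OutcomeTally {

    /** Maps a raw network output to -1 (away win), 0 (draw) or 1 (home win). */
    static int toClass(double rawOutput) {
        long rounded = Math.round(rawOutput);
        // Clamp so that outputs beyond the label range still map to a valid label.
        return (int) Math.max(-1, Math.min(1, rounded));
    }

    /** Returns counts indexed as [away, draw, home] for labels in {-1, 0, 1}. */
    static int[] countClasses(int[] labels) {
        int[] counts = new int[3];
        for (int label : labels) {
            counts[label + 1]++;
        }
        return counts;
    }

    /** Fraction of positions where the predicted label equals the actual label. */
    static double accuracy(int[] actual, int[] predicted) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    /** Accuracy obtained by always predicting the most frequent actual class. */
    static double majorityBaseline(int[] actual) {
        int[] counts = countClasses(actual);
        int max = Math.max(counts[0], Math.max(counts[1], counts[2]));
        return (double) max / actual.length;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only, not real match data.
        int[] actual = {1, 1, 0, -1, 1, 0};
        double[] raw = {0.9, 0.7, 0.6, 0.8, -0.2, 0.1};
        int[] predicted = Arrays.stream(raw).mapToInt(OutcomeTally::toClass).toArray();

        System.out.println("predicted [away, draw, home] = "
                + Arrays.toString(countClasses(predicted)));
        System.out.println("accuracy = " + accuracy(actual, predicted));
        System.out.println("baseline = " + majorityBaseline(actual));
    }
}
```

If the accuracy on matches the network has not seen is not clearly above the baseline, the network has probably not learned much from the inputs.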

1.) I am feeding the data to the neural network 1000 times. I am currently testing with more and fewer passes, since things got better.

2, 3.) I have 16 input parameters. They consist of: home team points; home team home wins, draws and losses; home team total wins, losses and draws; and form (points gained in the last 5 matches). The same data is used for the away team, except that away wins, draws and losses replace the home wins, draws and losses. I'll try with different training data.

Sheldon answered 16/8, 2012 at 0:18 Comment(0)

It is hard to say what is wrong given the information; there could be multiple reasons. But here are some potential solutions.

1) How many times are you feeding the training data to the neural network? Usually you will need to make multiple passes over the training data for the network to converge. One pass is not enough, especially if you only have 90 training samples.

2) How many input parameters are in the training data (and what are they)? Typically you need to adjust the number of hidden-layer nodes to the number of input parameters. There are no hard rules for this, but I typically start with at least twice as many hidden-layer nodes as input parameters.

3) Have you tried choosing different testing data? I'm assuming that your training and testing data are different. There could be something wrong with the testing data you've selected, as in it does not match up with the training data at all. It is also entirely possible that no reliable estimate can be obtained with your method: your input parameters may simply be insufficient to predict who wins any given match. This is the "garbage in, garbage out" concept.
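
To make point 3 concrete, here is a minimal sketch of holding out part of the matches for testing. It is plain Java and does not use Encog; the class `HoldoutSplit`, the 20% test fraction and the seed are assumptions chosen for illustration. Each row stands for whatever object represents one match.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Splits a list of rows into shuffled, non-overlapping training and test parts. */
final class HoldoutSplit {

    /** Simple container for the two partitions. */
    static final class Split<T> {
        final List<T> train;
        final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the rows with a fixed seed (so the split is repeatable)
     * and puts the first testFraction of them into the test partition.
     */
    static <T> Split<T> split(List<T> rows, double testFraction, long seed) {
        if (testFraction <= 0.0 || testFraction >= 1.0) {
            throw new IllegalArgumentException("testFraction must be between 0 and 1");
        }
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));

        int testSize = (int) Math.round(shuffled.size() * testFraction);
        List<T> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<T> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split<>(train, test);
    }

    public static void main(String[] args) {
        // Stand-in rows: match indices 0..89 instead of real match data.
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < 90; i++) {
            matches.add(i);
        }
        Split<Integer> split = split(matches, 0.2, 42L);
        System.out.println("train size = " + split.train.size());
        System.out.println("test size = " + split.test.size());
    }
}
```

Repeating the split with several seeds and averaging the success rate gives a steadier number than a single run. For match data, splitting by date (train on earlier matches, test on later ones) may be a better fit than a random shuffle.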

Lautrec answered 16/8, 2012 at 14:3 Comment(3)
Thank you for the answer. The part about the number of hidden nodes helped a lot, since I now get predictions that are far more realistic than before, when at least 80% of predictions were either 1 or 0. I will update the first post with answers; please check it out and leave a comment! - Sheldon
I'm having the same problem using Encog. Any updates on this? - Rodgerrodgers
When you say "you will need to make multiple passes feeding in the training data to make the network converge," what do you mean? When I train my data, I pass through it via do { train.Iteration() } while (train.Error > 0.001);. Is this not enough? Note: My data is not converging, the error rate is 92%, and I only have about 200 rows of data. - Sparks
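
On the loop in the last comment: a loop that stops only when the error falls below a threshold never ends if training does not converge, so it is common to add an upper limit on the number of epochs. The sketch below shows that idea in Java. `Trainer` is a hypothetical stand-in interface modeled on the `iteration()` and `getError()` methods of Encog's Java `MLTrain`; the class `CappedTraining` and the numbers in `main` are assumptions for illustration.

```java
/** Hypothetical stand-in for a trainer that exposes iteration() and getError(). */
interface Trainer {
    void iteration();   // one full pass over the training data
    double getError();  // training error after the most recent pass
}

/** Trains until the error target is met or the epoch limit is reached. */
final class CappedTraining {

    /** Returns the number of epochs actually run (at least 1, at most maxEpochs). */
    static int trainUntil(Trainer trainer, double targetError, int maxEpochs) {
        if (maxEpochs < 1) {
            throw new IllegalArgumentException("maxEpochs must be at least 1");
        }
        int epochs = 0;
        do {
            trainer.iteration();
            epochs++;
        } while (trainer.getError() > targetError && epochs < maxEpochs);
        return epochs;
    }

    public static void main(String[] args) {
        // Fake trainer whose error never improves, to show that the cap ends the loop.
        Trainer stuck = new Trainer() {
            @Override public void iteration() { /* no learning happens */ }
            @Override public double getError() { return 0.92; }
        };
        int epochs = trainUntil(stuck, 0.001, 500);
        System.out.println("stopped after " + epochs + " epochs, error = " + stuck.getError());
    }
}
```

If the loop ends because of the cap rather than the error target, that suggests the network, the inputs or the data volume needs attention, and more passes alone are unlikely to help.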
