Simple accord.net machine learning example

I’m new to machine learning and new to accord.net (I code C#).

I want to create a simple project that takes a simple oscillating time series, has accord.net learn it, and then predicts what the next value will be.

This is what the data (time series) should look like:

X - Y

1 - 1

2 - 2

3 - 3

4 - 2

5 - 1

6 - 2

7 - 3

8 - 2

9 - 1

Then I want it to predict the following:

X - Y

10 - 2

11 - 3

12 - 2

13 - 1

14 - 2

15 - 3

Can you guys help me out with some examples on how to solve it?

Nonplus answered 13/11, 2016 at 11:40 Comment(0)

A simple way to do this would be to use an Accord ID3 decision tree.

The trick is to work out what inputs to use. You can't just train on X, because the tree won't learn anything about future values of X from that. However, you can build some features derived from X (or from previous values of Y) that will be useful.

Normally, for problems like this, you would make each prediction based on features derived from previous values of Y (the thing being predicted) rather than X. However, that assumes you can observe Y sequentially between predictions (so you can't predict for an arbitrary X), so I'll stick with the question as presented.

Below I had a go at building an Accord ID3 decision tree to solve this problem. I used a few different values of x % n as the features, hoping the tree could work out the answer from them. In fact x % 4 on its own already determines y (it carries the same information as (x-1) % 4), so the tree can solve this in a single level with just that attribute, but I guess the point is more to let the tree find the patterns.
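To see whether a single candidate feature is enough on its own, you can group the x values by that feature and check that every group contains only one y. The following is a minimal sketch in plain C# (no Accord involved); the class and method names `Mod4Check`, `Mod4DeterminesY` and `Mod4ToY` are made up for illustration, and `CalcY` is copied from the answer's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Mod4Check
{
    // the repeating pattern from the question
    static readonly int[] YSequence = { 1, 2, 3, 2 };

    // same definition as CalcY in the answer (valid for x >= 1)
    public static int CalcY(int x) => YSequence[(x - 1) % 4];

    // True when every value of x % 4 maps to exactly one y over 1..maxX,
    // i.e. the feature alone separates the classes.
    public static bool Mod4DeterminesY(int maxX)
    {
        return Enumerable.Range(1, maxX)
            .GroupBy(x => x % 4)
            .All(g => g.Select(x => CalcY(x)).Distinct().Count() == 1);
    }

    // The lookup table that a single split on x % 4 would effectively encode.
    public static Dictionary<int, int> Mod4ToY(int maxX)
    {
        return Enumerable.Range(1, maxX)
            .GroupBy(x => x % 4)
            .ToDictionary(g => g.Key, g => CalcY(g.First()));
    }
}
```

Calling `Mod4Check.Mod4DeterminesY(24)` returns true for this series, and `Mod4ToY` gives the mapping 1 to 1, 2 to 2, 3 to 3 and 0 to 2. The same grouping check can be pointed at any other candidate feature before handing it to a learner.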

Here is the code for the ID3 tree:

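    // usings required at the top of the file
    using System;
    using System.Linq;
    using Accord.MachineLearning.DecisionTrees;
    using Accord.MachineLearning.DecisionTrees.Learning;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    // the members below belong inside a [TestClass] class

    // this is the sequence y follows
    int[] ysequence = new int[] { 1, 2, 3, 2 };

    // this generates the correct Y for a given X (valid for x >= 1)
    int CalcY(int x) => ysequence[(x - 1) % 4];

    // this generates some inputs - just a few different moduli of x
    int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };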


    // for https://mcmap.net/q/1158252/-simple-accord-net-machine-learning-example
    [TestMethod]
    public void AccordID3TestStackOverFlowQuestion2()
    {
        // build the training data set
        int numtrainingcases = 12;
        int[][] inputs = new int[numtrainingcases][];
        int[] outputs = new int[numtrainingcases];

        Console.WriteLine("\t\t\t\t x \t y");
        for (int x = 1; x <= numtrainingcases; x++)
        {
            int y = CalcY(x);
            inputs[x-1] = CalcInputs(x);
            outputs[x-1] = y;
            Console.WriteLine("TrainingData \t " +x+"\t "+y);
        }

        // define how many values each input can have
        DecisionVariable[] attributes =
        {
            new DecisionVariable("Mod2",2),
            new DecisionVariable("Mod3",3),
            new DecisionVariable("Mod4",4),
            new DecisionVariable("Mod5",5),
            new DecisionVariable("Mod6",6)
        };

        // define how many outputs (+1 only because y doesn't use zero)
        int classCount = outputs.Max()+1;

        // create the tree
        DecisionTree tree = new DecisionTree(attributes, classCount);

        // Create a new instance of the ID3 algorithm
        ID3Learning id3learning = new ID3Learning(tree);

        // Learn the training instances! Populates the tree
        id3learning.Learn(inputs, outputs);

        Console.WriteLine();
        // now try to predict some cases that weren't in the training data
        for (int x = numtrainingcases+1; x <= 2* numtrainingcases; x++)
        {
            int[] query = CalcInputs(x);

            int answer = tree.Decide(query); // makes the prediction

            Assert.AreEqual(CalcY(x), answer); // check the answer is what we expected - ie the tree got it right
            Console.WriteLine("Prediction \t\t " + x+"\t "+answer);
        }
    }

This is the output it produces:

                 x   y
TrainingData     1   1
TrainingData     2   2
TrainingData     3   3
TrainingData     4   2
TrainingData     5   1
TrainingData     6   2
TrainingData     7   3
TrainingData     8   2
TrainingData     9   1
TrainingData     10  2
TrainingData     11  3
TrainingData     12  2

Prediction       13  1
Prediction       14  2
Prediction       15  3
Prediction       16  2
Prediction       17  1
Prediction       18  2
Prediction       19  3
Prediction       20  2
Prediction       21  1
Prediction       22  2
Prediction       23  3
Prediction       24  2

Hope that helps.

EDIT: Following the comments, the example below is modified to train on previous values of the target (Y) rather than on features derived from the time index (X). This means you can't start training at the very start of your series, because each training row needs a back history of previous values of Y. In this example I started at x=9 just because that keeps the same sequence.

    // this is the sequence y follows
    int[] ysequence = new int[] { 1, 2, 3, 2 };

    // this generates the correct Y for a given X (valid for x >= 1)
    int CalcY(int x) => ysequence[(x - 1) % 4];

    // this generates the inputs - the five previous values of Y (needs x >= 6)
    int[] CalcInputs(int x) => new int[] { CalcY(x-1), CalcY(x-2), CalcY(x-3), CalcY(x-4), CalcY(x - 5) };
    //int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };


    // for https://mcmap.net/q/1158252/-simple-accord-net-machine-learning-example
    [TestMethod]
    public void AccordID3TestTestStackOverFlowQuestion2()
    {
        // build the training data set
        int numtrainingcases = 12;
        int starttrainingat = 9;
        int[][] inputs = new int[numtrainingcases][];
        int[] outputs = new int[numtrainingcases];

        Console.WriteLine("\t\t\t\t x \t y");
        for (int x = starttrainingat; x < numtrainingcases + starttrainingat; x++)
        {
            int y = CalcY(x);
            inputs[x- starttrainingat] = CalcInputs(x);
            outputs[x- starttrainingat] = y;
            Console.WriteLine("TrainingData \t " +x+"\t "+y);
        }

        // define how many values each input can have
        DecisionVariable[] attributes =
        {
            new DecisionVariable("y-1",4),
            new DecisionVariable("y-2",4),
            new DecisionVariable("y-3",4),
            new DecisionVariable("y-4",4),
            new DecisionVariable("y-5",4)
        };

        // define how many outputs (+1 only because y doesn't use zero)
        int classCount = outputs.Max()+1;

        // create the tree
        DecisionTree tree = new DecisionTree(attributes, classCount);

        // Create a new instance of the ID3 algorithm
        ID3Learning id3learning = new ID3Learning(tree);

        // Learn the training instances! Populates the tree
        id3learning.Learn(inputs, outputs);

        Console.WriteLine();
        // now try to predict some cases that weren't in the training data
        for (int x = starttrainingat+numtrainingcases; x <= starttrainingat + 2 * numtrainingcases; x++)
        {
            int[] query = CalcInputs(x);

            int answer = tree.Decide(query); // makes the prediction

            Assert.AreEqual(CalcY(x), answer); // check the answer is what we expected - ie the tree got it right
            Console.WriteLine("Prediction \t\t " + x+"\t "+answer);
        }
    }
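In the example above the lagged inputs come from `CalcY`, which is only possible because the true pattern is known. With a real series you would only have an array of observed Y values, so the same rows have to be built by sliding a window over that array. The following is a rough sketch of that step in plain C#; `LagFeatures`, `Build` and `NextQuery` are hypothetical names, not part of Accord, and the column order (most recent value first) is chosen to match `CalcInputs` above.

```csharp
using System;

public static class LagFeatures
{
    // Builds training rows from an observed series:
    //   inputs[i]  = { y[t-1], y[t-2], ..., y[t-lags] }
    //   outputs[i] = y[t]
    // for t = lags .. series.Length - 1.
    public static void Build(int[] series, int lags,
                             out int[][] inputs, out int[] outputs)
    {
        if (lags < 1 || series.Length <= lags)
            throw new ArgumentException("Need more observations than lags.");

        int count = series.Length - lags;
        inputs = new int[count][];
        outputs = new int[count];

        for (int i = 0; i < count; i++)
        {
            int t = i + lags;
            inputs[i] = new int[lags];
            for (int k = 1; k <= lags; k++)
                inputs[i][k - 1] = series[t - k];
            outputs[i] = series[t];
        }
    }

    // The query row for predicting the value that follows the end of the series.
    public static int[] NextQuery(int[] series, int lags)
    {
        if (lags < 1 || series.Length < lags)
            throw new ArgumentException("Need at least 'lags' observations.");

        var query = new int[lags];
        for (int k = 1; k <= lags; k++)
            query[k - 1] = series[series.Length - k];
        return query;
    }
}
```

The resulting `inputs` and `outputs` arrays have the same shape as the ones passed to `id3learning.Learn(inputs, outputs)` above, and the row from `NextQuery` can be passed to `tree.Decide(query)`. To predict several steps ahead you would append each prediction to the series and build the next query from the extended array.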

You could also consider training on the differences between previous values of Y - which would work better where the absolute value of Y is not as important as the relative change.
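As a sketch of that idea, the differences can be computed from the observed series and then shifted so that they are non-negative, on the assumption (worth checking against the Accord documentation) that a discrete `DecisionVariable` created with a symbol count expects values from 0 up to count - 1. The names `DiffFeatures`, `Differences` and `ToSymbols` are made up for this illustration.

```csharp
using System;
using System.Linq;

public static class DiffFeatures
{
    // diffs[i] = series[i + 1] - series[i]
    public static int[] Differences(int[] series)
    {
        if (series.Length < 2)
            throw new ArgumentException("Need at least two observations.");

        var diffs = new int[series.Length - 1];
        for (int i = 0; i < diffs.Length; i++)
            diffs[i] = series[i + 1] - series[i];
        return diffs;
    }

    // Shifts the differences so the smallest one becomes symbol 0.
    // offset is the amount added; symbolCount is the number of distinct
    // symbol slots needed to cover the observed range.
    public static int[] ToSymbols(int[] diffs, out int offset, out int symbolCount)
    {
        int min = diffs.Min();
        int max = diffs.Max();
        offset = -min;
        symbolCount = max - min + 1;

        int shift = offset; // out parameters cannot be captured by a lambda
        return diffs.Select(d => d + shift).ToArray();
    }
}
```

The shifted differences can then be used as inputs and targets in the same way the previous values of Y are used above, with `symbolCount` as the number of symbols per variable. A predicted symbol minus `offset` is the predicted change, which is added to the last observed Y to get the next value.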

Sonatina answered 13/11, 2016 at 18:58 Comment(7)
This is brilliant, I learned a lot from this example (how to produce inputs and outputs). The example worked perfectly. But in the "real case" I cannot use the X value for calculations, since it's a time series (e.g. x1 = 3:00 AM, x2 = 4:00 AM, x3 = 5:00 AM), so I only have a time series of all the Y values and want to find a pattern there to help predict what the next Y value will be... if that makes sense? - Nonplus
Sure - it's more natural to use previous values of the target (Y) for time series, at least when the actual time is irrelevant and the relationship between the values is where the pattern lies. - Sonatina
I'll edit the answer to add how the example can be modified to train on previous values of Y. - Sonatina
Thank you a lot, I really appreciate your fast response and help. THANK YOU. - Nonplus
Thank you @reddal. If the output Ys are real numbers and there is no specific number of classes, what would you suggest? E.g. we have a series of numbers like { 0.4, 0.9, 0.3, 1.2, 0.7 } and now we want to predict the next value. - Boding
I use SimpleLinearRegression; maybe there is a better way. - Boding
Would you mind suggesting an ML framework and samples for some specific problems, please? That way we understand the problem, we know how to prepare the input values, we know which algorithm is selected to make the prediction, and we can try to predict something. Thanks! - Displace
