I'm looking at the .cs file here: https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet/get-started/windows and everything works well.
Now I'd like to improve the example: I'd like to predict from a numbers-only dataset rather than a number-string dataset, for example to predict the output of a seven-segment display.
Here is my very simple dataset; the last column is the integer that I want to predict:
1,0,1,1,1,1,1,0
0,0,0,0,0,1,1,1
1,1,1,0,1,1,0,2
1,1,1,0,0,1,1,3
0,1,0,1,0,1,1,4
1,1,1,1,0,0,1,5
1,1,1,1,1,0,1,6
1,0,0,0,0,1,1,7
1,1,1,1,1,1,1,8
1,1,1,1,0,1,1,9
And here is my test code:
    public class Digit
    {
        [Column("0")] public float Up;
        [Column("1")] public float Middle;
        [Column("2")] public float Bottom;
        [Column("3")] public float UpLeft;
        [Column("4")] public float BottomLeft;
        [Column("5")] public float TopRight;
        [Column("6")] public float BottomRight;
        [Column("7")] [ColumnName("DigitValue")]
        public float DigitValue;
    }

    public class DigitPrediction
    {
        [ColumnName("PredictedDigits")] public float PredictedDigits;
    }

    public PredictDigit()
    {
        var pipeline = new LearningPipeline();
        var dataPath = Path.Combine("Segmenti", "segments.txt");
        pipeline.Add(new TextLoader<Digit>(dataPath, false, ","));
        pipeline.Add(new ColumnConcatenator("Label", "DigitValue"));
        pipeline.Add(new ColumnConcatenator("Features", "Up", "Middle", "Bottom", "UpLeft", "BottomLeft", "TopRight", "BottomRight"));
        pipeline.Add(new StochasticDualCoordinateAscentClassifier());
        var model = pipeline.Train<Digit, DigitPrediction>();
        var prediction = model.Predict(new Digit
        {
            Up = 1,
            Middle = 1,
            Bottom = 1,
            UpLeft = 1,
            BottomLeft = 1,
            TopRight = 1,
            BottomRight = 1,
        });
        Console.WriteLine($"Predicted digit is: {prediction.PredictedDigits}");
        Console.ReadLine();
    }
As you can see, it is very similar to the example provided, except for the handling of the last column ("Label"), because I need to predict a number and not a string. I tried:
pipeline.Add(new ColumnConcatenator("Label", "DigitValue"));
but it does not work and throws this exception:
Training label column 'Label' type is not valid for multi-class: Vec<R4, 1>. Type must be R4 or R8.
I'm sure I'm missing something, but I cannot find anything on the internet that helps me solve this problem.
UPDATE
I found that the dataset has to have a Label column like this:
[Column("7")] [ColumnName("Label")] public float Label;
and the DigitPrediction class a Score column like this:
    public class DigitPrediction
    {
        [ColumnName("Score")] public float[] Score;
    }
Now the system "works": prediction.Score is a Single[] in which the index of the highest value is the predicted digit.
Is it the right approach?
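A minimal sketch of that index lookup, written in plain C# without any ML.NET calls. ScoreDecoder and ArgMax are hypothetical helper names made up for this illustration, and the sketch assumes that index i of the Score array corresponds to digit i, which may not hold for every trainer or label encoding:

```csharp
using System;

// Hypothetical helper (not part of ML.NET): picks the most likely class
// from a per-class score array such as DigitPrediction.Score.
public static class ScoreDecoder
{
    // Returns the index of the largest value in scores.
    // Assumption: index i of the array corresponds to digit i.
    // On ties, the lowest index wins.
    public static int ArgMax(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must contain at least one value", nameof(scores));

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > scores[best])
                best = i;
        }
        return best;
    }
}
```

It would be called as `int digit = ScoreDecoder.ArgMax(prediction.Score);`. For an illustrative array such as `{ 0.1f, 0.7f, 0.2f }` it returns 1.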
UPDATE 2 - Complete code example
Following the answer and other suggestions, I got the right result. If you need it, you can find the complete code here.