ML.NET 0.7 - Get Scores and Labels for MulticlassClassification

I'm using ML.NET 0.7 and have a MulticlassClassification model with the following result class:

public class TestClassOut
{
  public string Id { get; set; }
  public float[] Score { get; set; }
  public string PredictedLabel { get; set; }
}

I'd like to know which label each score in the Score property corresponds to. It feels like I should be able to make the property a Tuple<string, float> or similar, so that each score carries the label it represents.

I understand that there was a method for this in v0.5:

model.TryGetScoreLabelNames(out scoreLabels);

But I can't seem to find the equivalent in v0.7.

Can this be done? If so, how?

Ultramontane answered 12/11, 2018 at 16:26 Comment(0)

This is probably not the answer you're looking for, but I ended up copying the code from TryGetScoreLabelNames (it's in the Legacy namespace as of 0.7) and tweaking it to use the schema from my input data. The dataView below is an IDataView I created from my prediction input data so I could get the schema off of it.

// Adapted from the Legacy TryGetScoreLabelNames; model and dataView are defined elsewhere.
public bool TryGetScoreLabelNames(out string[] names, string scoreColumnName = DefaultColumnNames.Score)
{
    names = null;

    // Ask the model which schema it produces for this input schema.
    Schema outputSchema = model.GetOutputSchema(dataView.Schema);

    // Locate the score column.
    if (!outputSchema.TryGetColumnIndex(scoreColumnName, out int col))
        return false;

    // The score vector must carry one slot name per class.
    int valueCount = outputSchema.GetColumnType(col).ValueCount;
    if (!outputSchema.HasSlotNames(col, valueCount))
        return false;

    // Read the slot names (the label names) from the column metadata.
    VBuffer<ReadOnlyMemory<char>> vbuffer = new VBuffer<ReadOnlyMemory<char>>();
    outputSchema.GetMetadata<VBuffer<ReadOnlyMemory<char>>>("SlotNames", col, ref vbuffer);
    if (vbuffer.Length != valueCount)
        return false;

    // Copy the names into a plain string array, in score order.
    names = new string[valueCount];
    int num = 0;
    foreach (ReadOnlyMemory<char> denseValue in vbuffer.DenseValues())
        names[num++] = denseValue.ToString();
    return true;
}

I also asked this question in the ML.NET Gitter room (https://gitter.im/dotnet/mlnet) and got this response from Zruty0:

my best suggestion is to convert labels to 0..(N-1) beforehand, then train, and then inspect the resulting 'Score' column. It'll be a vector of size N, with per-class scores. PredictedLabel is actually just argmax(Score), and you can get the 2nd and other candidates by sorting Score

If you have a static set of classes this might be a better option, but my situation has an ever-growing set of classes.
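
For the fixed-class case, the quoted suggestion can be sketched in plain C#. This is only an illustration, not ML.NET API: the ScoreRanking class and its method names are hypothetical, and it assumes the labels were already mapped to 0..(N-1) so that index i of Score belongs to class i.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class ScoreRanking
{
    // Returns the index of the largest score, i.e. argmax(Score).
    public static int ArgMax(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must be non-empty", nameof(scores));

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > scores[best])
                best = i;
        }
        return best;
    }

    // Returns all class indices ordered from highest to lowest score,
    // so element 0 is the top prediction and element 1 the runner-up.
    public static int[] RankClasses(float[] scores)
    {
        if (scores == null)
            throw new ArgumentNullException(nameof(scores));

        return Enumerable.Range(0, scores.Length)
            .OrderByDescending(i => scores[i])
            .ToArray();
    }
}
```

With scores { 0.1f, 0.7f, 0.2f }, ArgMax returns 1 and RankClasses returns the indices 1, 2, 0.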

Poppo answered 12/11, 2018 at 21:57 Comment(1)
Note: ValueCount will be gone in 0.8, so you'll have to cast instead: ((VectorType)col).Size. - Pearlstein

This was asked a while ago, but I think that this is still a very relevant question that surprisingly has not got a lot of traction and isn't mentioned (as of the time of writing) in any of the Microsoft ML.NET tutorials. The sample code above needs a bit of tweaking to get it to work with v1.5 (preview), so I thought I'd post how I got it working for anyone else who stumbles across this.

In ConsumeModel.cs (assuming you're using the Model Builder in Visual Studio):

...
// Use model to make prediction on input data
ModelOutput result = predEngine.Predict(input);
var labelNames = new List<string>();

// "label" is the label column's name in this model; substitute your own.
var column = predEngine.OutputSchema.GetColumnOrNull("label");
if (column.HasValue)
{
    // Read the key values of the label column, i.e. the original label names.
    VBuffer<ReadOnlyMemory<char>> vbuffer = new VBuffer<ReadOnlyMemory<char>>();
    column.Value.GetKeyValues(ref vbuffer);

    foreach (ReadOnlyMemory<char> denseValue in vbuffer.DenseValues())
        labelNames.Add(denseValue.ToString());
}
...

The end result is that labelNames is now a collection parallel to result.Score: the name at index i is the label for the score at index i. Just keep in mind that changes to the generated files could get overwritten if you rebuild the model using Model Builder.
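
Once the label names and the scores line up by index, they can be combined into the label/score tuples the question asked for. The following is a minimal sketch in plain C#. The LabeledScores class and its Pair method are hypothetical names, not ML.NET API, and it assumes both collections have the same length and ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class LabeledScores
{
    // Pairs each label name with the score at the same index,
    // then orders the pairs from highest to lowest score.
    public static List<(string Label, float Score)> Pair(
        IReadOnlyList<string> labelNames, IReadOnlyList<float> scores)
    {
        if (labelNames == null) throw new ArgumentNullException(nameof(labelNames));
        if (scores == null) throw new ArgumentNullException(nameof(scores));
        if (labelNames.Count != scores.Count)
            throw new ArgumentException("labelNames and scores must have the same length.");

        return labelNames
            .Zip(scores, (label, score) => (Label: label, Score: score))
            .OrderByDescending(pair => pair.Score)
            .ToList();
    }
}
```

Calling LabeledScores.Pair(labelNames, result.Score) after the snippet above would give the candidates in ranked order. For example, labels "cat", "dog", "bird" with scores 0.2, 0.5, 0.3 yield "dog" first and "cat" last.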

Juback answered 12/5, 2020 at 1:55 Comment(1)
Surprised this hasn't received more love. It easily allowed me to return a tuple. Again, surprised it isn't a feature. Thanks. - Superimposed
