How to translate the intro ML.Net demo to F#?

I'm looking at the .cs file here: https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet/get-started/windows. My attempt to translate it to F# compiles just fine, but it throws a System.Reflection.TargetInvocationException when run: FormatException: One of the identified items was in an invalid format. What am I missing?

Edited: I was using records before.

open Microsoft.ML
open Microsoft.ML.Runtime.Api
open Microsoft.ML.Trainers
open Microsoft.ML.Transforms
open System

type IrisData = 
    [<Column("0")>] val mutable SepalLength : float
    [<Column("1")>] val mutable SepalWidth : float
    [<Column("2")>] val mutable PetalLength : float
    [<Column("3")>] val mutable PetalWidth : float
    [<Column("4");ColumnName("Label")>] val mutable Label : string

    new(sepLen, sepWid, petLen, petWid, label) = 
        { SepalLength = sepLen
          SepalWidth = sepWid
          PetalLength = petLen
          PetalWidth =  petWid
          Label = label }

type IrisPrediction = 
    [<ColumnName("PredictedLabel")>] val mutable PredictedLabels : string
    new() = { PredictedLabels = "Iris-setosa" }


[<EntryPoint>]
let main argv = 
    let pipeline = new LearningPipeline()
    let dataPath = "iris.data.txt"
    pipeline.Add(new TextLoader<IrisData>(dataPath,separator = ","))
    pipeline.Add(new Dictionarizer("Label"))
    pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
    pipeline.Add(new StochasticDualCoordinateAscentClassifier())
    pipeline.Add(new PredictedLabelColumnOriginalValueConverter(PredictedLabelColumn = "PredictedLabel") )    
    let model = pipeline.Train<IrisData, IrisPrediction>()


    let prediction = model.Predict(IrisData(3.3, 1.6, 0.2, 5.1,""))

    Console.WriteLine("Predicted flower type is: {prediction.PredictedLabels}")

    0 // return an integer exit code
Inkblot answered 14/5, 2018 at 3:13 Comment(4)
The F# equivalents of the C# IrisData and IrisPrediction classes used in the tutorial are custom types (POCOs), not the F# records used in your code. Causerie
Using POCOs still throws the same error. Inkblot
The fields in your POCOs are not public, as they are in C#. Causerie
Note that there is no longer a need for ugly mutable fields: github.com/dotnet/machinelearning/pull/616. So the style of code in the question and answers doesn't correspond to the current state of ML.Net. Youngstown

Below is a working F# version of the code for the ML tutorial, using Microsoft.ML 0.1.0 (it might break with newer versions). Two major differences from your code make the sample work, both within the IrisData and IrisPrediction type definitions:

  • An accurate F# representation of the C# POCOs: each type has a parameterless constructor and public fields
  • A correct port of the C# float type, which is float32 in F# (F# float is 64-bit, the equivalent of C# double)
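
Both points can be checked without ML.NET. The following is a minimal sketch, not part of the tutorial: the Measurement type and its Length field are hypothetical names chosen for illustration. It shows a POCO-style class that can be created through its parameterless constructor, and confirms which .NET types the F# names float32 and float refer to.

```fsharp
open System

// Hypothetical POCO (not from the tutorial): a parameterless constructor
// plus a public mutable field, the shape described in the first bullet.
type Measurement() =
    [<DefaultValue>]
    val mutable public Length : float32

// C# `float` is System.Single, which F# spells `float32`.
// F# `float` is System.Double, the same type as C# `double`.
let float32IsSingle = (typeof<float32> = typeof<Single>)
let floatIsDouble = (typeof<float> = typeof<Double>)
let singleBytes = sizeof<float32>
let doubleBytes = sizeof<float>

// Creating the instance this way only works because the type
// exposes a public parameterless constructor.
let m = Activator.CreateInstance<Measurement>()
m.Length <- 3.3f // the `f` suffix makes this a float32 literal

printfn "float32 is System.Single: %b" float32IsSingle
printfn "float is System.Double: %b" floatIsDouble
printfn "float32 uses %d bytes, float uses %d bytes" singleBytes doubleBytes
printfn "Length = %f" m.Length
```

A field declared as F# float therefore holds a 64-bit value where the C# tutorial class holds a 32-bit one, so the port has to use float32.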

Here is the code:

open Microsoft.ML
open Microsoft.ML.Runtime.Api
open Microsoft.ML.Trainers
open Microsoft.ML.Transforms
open System

type IrisData() =
    [<Column("0")>]
    [<DefaultValue>]
    val mutable public SepalLength : float32
    [<Column("1")>]
    [<DefaultValue>]
    val mutable public SepalWidth : float32
    [<Column("2")>]
    [<DefaultValue>]
    val mutable public PetalLength : float32
    [<Column("3")>]
    [<DefaultValue>]
    val mutable public PetalWidth : float32
    [<Column("4")>]
    [<ColumnName("Label")>]
    [<DefaultValue>]
    val mutable public Label : string

type IrisPrediction() =
    [<ColumnName("PredictedLabel")>]
    [<DefaultValue>]
    val mutable public PredictedLabel : string

[<EntryPoint>]
let main argv =
    let pipeline = new LearningPipeline()
    let dataPath = "iris.data.txt"
    pipeline.Add(new TextLoader<IrisData>(dataPath, separator = ","))
    pipeline.Add(new Dictionarizer("Label"))
    pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
    pipeline.Add(new StochasticDualCoordinateAscentClassifier())
    pipeline.Add(new PredictedLabelColumnOriginalValueConverter(PredictedLabelColumn = "PredictedLabel") )    
    let model = pipeline.Train<IrisData, IrisPrediction>()

    let x = IrisData()
    x.SepalLength <- 3.3f
    x.SepalWidth <- 1.6f
    x.PetalLength <- 0.2f
    x.PetalWidth <- 5.1f
    let prediction = model.Predict(x)

    printfn "Predicted flower type is: %s"  prediction.PredictedLabel

    0

and the output it produces:

Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Using 4 threads to train.
Automatically choosing a check frequency of 4.
Auto-tuning parameters: maxIterations = 9996.
Auto-tuning parameters: L2 = 2.668802E-05.
Auto-tuning parameters: L1Threshold (L1/L2) = 0.
Using best model from iteration 892.
Not training a calibrator because it is not needed.
Predicted flower type is: Iris-virginica
Press any key to continue . . .
Causerie answered 14/5, 2018 at 17:58 Comment(1)
As of today (Microsoft.ML 0.2), this needs an additional open Microsoft.ML.Data in the header and pipeline.Add(TextLoader(dataPath).CreateFrom<IrisData>(separator=',')) to load the CSV; see also #50910870. Barnabas
