ML.NET: retrain an existing model rather than training a new model

I am training an ML.NET machine learning model. I can train it, predict from it, and save it to and load it from disk. But I also need to be able to load it from disk and then retrain it, or add new information to it, so that it improves over time.

Does anyone know if this is possible? I have not found anything in the Microsoft docs on how to do it, but it's a pretty standard thing in ML, so I'd be surprised if it's not possible.

Thanks

Bearish answered 23/9, 2018 at 22:28 Comment(0)

If you do end up looking into ML.NET, I recommend the ML.NET Model Builder, which has a really straightforward tutorial. Essentially, you download a Visual Studio extension that lets you create a new model through a GUI. It even runs your data through a number of machine learning algorithms and picks the most accurate one. Once your model is created, the extension generates the source code it used to create the model, so you can review exactly what it did and make adjustments where needed.

The model it creates can be retrained at any point by following the Microsoft documentation on retraining a model. All you need to do is load the model and the pipeline it previously used, then run a new set of data through them. You can then save the retrained model back to disk.

One note: I found that saving the pipeline .zip file to disk when initially creating the model made retraining easier later on.
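To make the load-then-retrain flow concrete, here is a minimal sketch against the ML.NET 1.x API (it needs the Microsoft.ML NuGet package). The `Sample` class, the synthetic data, and the file names are all made up for illustration, and the choice of `OnlineGradientDescent` is an assumption: as far as I know, only some trainers (mostly linear ones) accept existing model parameters as a starting point, so check whether yours does.

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

// Hypothetical row type: one numeric feature and a numeric label.
public class Sample
{
    public float X { get; set; }
    public float Label { get; set; }
}

public static class RetrainSketch
{
    // Synthetic data (y = 2x + 1), standing in for your real data source.
    static Sample[] MakeData(int from, int to) =>
        Enumerable.Range(from, to - from)
                  .Select(i => new Sample { X = i / 100f, Label = 2f * (i / 100f) + 1f })
                  .ToArray();

    public static LinearRegressionModelParameters Run()
    {
        var mlContext = new MLContext(seed: 0);

        // 1. Initial training. Data preparation and the trainer are fitted
        //    separately so that each can be saved to its own file.
        IDataView initialData = mlContext.Data.LoadFromEnumerable(MakeData(0, 100));
        ITransformer dataPrep = mlContext.Transforms
            .Concatenate("Features", nameof(Sample.X))
            .Fit(initialData);
        IDataView preparedInitial = dataPrep.Transform(initialData);
        var initialModel = mlContext.Regression.Trainers
            .OnlineGradientDescent(labelColumnName: "Label", featureColumnName: "Features")
            .Fit(preparedInitial);

        // 2. Save the data preparation pipeline and the model (file names are placeholders).
        mlContext.Model.Save(dataPrep, initialData.Schema, "data_prep.zip");
        mlContext.Model.Save(initialModel, preparedInitial.Schema, "model.zip");

        // 3. Later: load both back from disk.
        ITransformer loadedPrep = mlContext.Model.Load("data_prep.zip", out DataViewSchema prepSchema);
        ITransformer loadedModel = mlContext.Model.Load("model.zip", out DataViewSchema modelSchema);

        // 4. Pull the learned parameters out of the loaded model.
        var originalParameters =
            ((ISingleFeaturePredictionTransformer<object>)loadedModel).Model
                as LinearRegressionModelParameters;

        // 5. Prepare the new data with the saved pipeline, then continue
        //    training from the old parameters instead of from scratch.
        IDataView newData = mlContext.Data.LoadFromEnumerable(MakeData(100, 200));
        IDataView preparedNew = loadedPrep.Transform(newData);
        var retrainedModel = mlContext.Regression.Trainers
            .OnlineGradientDescent(labelColumnName: "Label", featureColumnName: "Features")
            .Fit(preparedNew, originalParameters);

        // 6. Save the retrained model back to disk.
        mlContext.Model.Save(retrainedModel, preparedNew.Schema, "model_retrained.zip");
        return retrainedModel.Model;
    }
}
```

Calling `RetrainSketch.Run()` writes three .zip files to the working directory. Keeping the data preparation pipeline in its own file is what lets step 5 apply exactly the same transforms to the new data before the trainer sees it.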

Disharmonious answered 28/6, 2019 at 1:20 Comment(0)

This functionality exists in ML.NET, but it is not possible with the existing LearningPipeline APIs. It will be exposed in the new ML.NET APIs, and there is already a sample enabling this scenario. The relevant code is:

// Train the first predictor.
var trainer = new LinearClassificationTrainer(env, new LinearClassificationTrainer.Arguments
{
    NumThreads = 1
}, "Features", "Label");
var firstModel = trainer.Fit(trainData);

// Train the second predictor on the same data.
var secondTrainer = new AveragedPerceptronTrainer(env, new AveragedPerceptronTrainer.Arguments());

var trainRoles = new RoleMappedData(trainData, label: "Label", feature: "Features");
var finalModel = secondTrainer.Train(new TrainContext(trainRoles, initialPredictor: firstModel.Model));

These APIs are still in flux, but perhaps this helps. This is not yet part of an official ML.NET release, so you would need to get a pre-release NuGet package or build the repo yourself.

Note: I am on the ML.NET team.

Spur answered 24/9, 2018 at 0:35 Comment(3)
So in version 0.5.0 we can't retrain a model? – Contingence
Not with the LearningPipeline APIs. ML.NET 0.6 (coming soon) will enable this scenario. – Spur
We're now at version 1.1.0... can you confirm whether this is available (and how it would be done)? – Pulchia