Model persistence in Scikit-Learn?

I am trying to save and load a scikit-learn model, but I run into problems when saving and loading happen on different Python versions. Here is what I have tried:

  1. Using pickle to save a model in Python 3 and deserialize it in Python 2. This works for some models, such as LR and SVM, but fails for KNN.

    >>> pickle.load(open("inPy3.pkl", 'rb')) #KNN model
    ValueError: non-string names in Numpy dtype unpickling
    
  2. I also tried serializing to and deserializing from JSON using jsonpickle, but I get the following error.

    data = jsonpickle.encode(lr) #lr = logisticRegression Model
    jsonpickle.decode(data)
    AttributeError: 'dict' object has no attribute '__name__'
    

I would also like to know whether there is a utility for serializing and deserializing scikit-learn model objects to and from a human-readable format (JSON, XML, protobuf, etc.).

Cornu answered 12/7, 2016 at 5:49 Comment(2)
I suspect this may be an issue with the pickling protocol you use: docs.python.org/3/library/pickle.html#pickle-protocols. If you pickle something in Python 3 and need to use it in Python 2, pass the protocol=2 keyword argument to pickle.dump; it is the highest protocol understood by pickle in Python 2. Armin
@Armin I tried this but get the same error. In Python 3: pickle.dump(neigh, open("knn_ser_py3.pkl", 'wb'), protocol=2, fix_imports=True). In Python 2: reconstructed = pickle.load(open("knn_ser_py3.pkl", 'rb')) raises ValueError: non-string names in Numpy dtype unpickling. Cornu

Instead of pickling whole models, you can extract and store their coefficients, then load the coefficients and initialize new models with them.

This is related to the sklearn upgrade question; a similar approach is valid across Python versions.
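A minimal sketch of that idea for a logistic regression model, storing the coefficients as JSON, which also gives a human-readable file. The helper names `export_logreg` and `import_logreg` are hypothetical (not part of scikit-learn), and the sketch assumes that `coef_`, `intercept_` and `classes_` are all that `predict` needs; other estimators will need a different set of attributes.

```python
import json

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def export_logreg(model):
    """Collect the fitted attributes used for prediction as plain Python lists.

    Hypothetical helper; assumes these three attributes are sufficient.
    """
    return {
        "coef": model.coef_.tolist(),
        "intercept": model.intercept_.tolist(),
        "classes": model.classes_.tolist(),
    }


def import_logreg(params):
    """Create an unfitted LogisticRegression and set the stored attributes on it."""
    model = LogisticRegression()
    model.coef_ = np.array(params["coef"])
    model.intercept_ = np.array(params["intercept"])
    model.classes_ = np.array(params["classes"])
    return model


# Train a small model to have something to export.
X, y = load_iris(return_X_y=True)
lr = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize to a JSON string (this could just as well be written to a file).
text = json.dumps(export_logreg(lr))

# Rebuild the model from the JSON string, possibly in another interpreter.
restored = import_logreg(json.loads(text))

# Check that the restored model predicts the same labels as the original.
same = bool(np.array_equal(lr.predict(X), restored.predict(X)))
print(same)
```

A model such as KNN has no coefficients to extract; there the analogous step would presumably be to store the training data and hyperparameters and refit on the target Python version.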

Ozalid answered 20/12, 2022 at 20:58 Comment(0)
