How to initialize coef_init and intercept_init for a new training model?
As described here, https://mcmap.net/q/332454/-save-progress-between-multiple-instances-of-partial_fit-in-python-sgdclassifier, I stored the coef_ and intercept_ of my first model. Later, I pass them as initializers to my second fit(), as shown below, to learn new data on top of the old model.

from sklearn import linear_model

def train_data():

    x1 = [[8, 9], [20, 22], [16, 18], [8,4]]
    y1 = [0, 1, 2, 3]

    #classes = np.arange(10)

    #sgd_clf = linear_model.SGDClassifier(learning_rate = 'constant', eta0 = 0.1, shuffle = False, n_iter = 1,warm_start=True)

    sgd_clf = linear_model.SGDClassifier(loss="hinge",max_iter=10000)

    sgd_clf.fit(x1,y1)

    coef = sgd_clf.coef_
    intercept = sgd_clf.intercept_

    return coef, intercept


def train_new_data(coefs,intercepts):

    x2 = [[18, 19],[234,897],[20, 122], [16, 118]]
    y2 = [4,5,6,7]

    sgd_clf1 = linear_model.SGDClassifier(loss="hinge",max_iter=10000)

    new_model = sgd_clf1.fit(x2,y2,coef_init=coefs,intercept_init=intercepts)

    return new_model


if __name__ == "__main__":

    coefs,intercepts= train_data()

    new_model = train_new_data(coefs,intercepts)

    print(new_model.predict([[16, 118]]))
    print(new_model.predict([[18, 19]]))
    print(new_model.predict([[8,9]]))
    print(new_model.predict([[20,22]]))

When I run this, I only get the labels that new_model was trained on. For instance, print(new_model.predict([[8, 9]])) should print label 0 and print(new_model.predict([[20, 22]])) should print label 1, but every prediction is a label from 4 to 7.

Am I passing the coefs and intercepts from the old model to the new one in the wrong way?

EDIT: Reframed the question as per @vital_dml's answer.

Vindicable answered 6/3, 2018 at 14:29 Comment(0)
I'm not sure why you need to pass coefficients and intercepts from the 1st model to the 2nd; however, you are getting this behaviour because your 1st model is trained against the classes y1 = [0, 1, 2, 3], while the 2nd one only ever sees y2 = [4, 5, 6, 7], so the two class sets don't even overlap.

According to the scikit-learn documentation, your linear_model.SGDClassifier() exposes:

coef_ : array, shape (1, n_features) if n_classes == 2 else (n_classes, n_features) - Weights assigned to the features.

intercept_ : array, shape (1,) if n_classes == 2 else (n_classes,) - Constants in decision function.

So, for your approach to work, the number of classes and the number of features have to be the same in both models.
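That shape constraint is easy to check directly. A minimal sketch (the data here is illustrative, not taken from the question):

```python
# Inspect the shapes SGDClassifier produces for a 4-class, 2-feature problem.
from sklearn import linear_model

X = [[8, 9], [20, 22], [16, 18], [8, 4]]
y = [0, 1, 2, 3]

clf = linear_model.SGDClassifier(loss="hinge", max_iter=10000)
clf.fit(X, y)

# With n_classes > 2, coef_ has one weight row per class and
# intercept_ one constant per class.
print(clf.coef_.shape)       # (4, 2)
print(clf.intercept_.shape)  # (4,)
```

Any coef_init/intercept_init you pass to a later fit() must match these shapes exactly.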

Anyway, I encourage you to think about whether you really need to do that. Maybe you could just concatenate the two datasets and fit a single model.
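That concatenation suggestion could look like this, a minimal sketch reusing the data from the question:

```python
# Pool the old and new samples and refit one model over all eight classes.
import numpy as np
from sklearn import linear_model

x1, y1 = [[8, 9], [20, 22], [16, 18], [8, 4]], [0, 1, 2, 3]
x2, y2 = [[18, 19], [234, 897], [20, 122], [16, 118]], [4, 5, 6, 7]

X = np.vstack([x1, x2])          # shape (8, 2)
y = np.concatenate([y1, y2])     # shape (8,)

clf = linear_model.SGDClassifier(loss="hinge", max_iter=10000)
clf.fit(X, y)

# The refit model now knows every class, old and new.
print(clf.classes_)  # [0 1 2 3 4 5 6 7]
```

This is batch retraining, so it costs a full pass over all the data, but unlike coef_init it actually carries the old classes into the new model.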

Ylem answered 6/3, 2018 at 16:10 Comment(5)
Hi Vital, I kept the same classes in both models now. Please see the changed question. I want to pass the coefs and intercepts from the old model to the new one because I want to load the old model's results rather than retrain on all of them whenever I get new data. – Vindicable
Hi, since you may get observations/classes you have never seen before, this relates to online learning methods. Your second model is merely initialized with the coefficients of the first model; it knows nothing about the classes those coefficients were trained against. So you either concatenate each new batch of observations with the previous ones and train against all possible classes together, or study some online learning methods, briefly described here. – Ylem
Hi, thanks. Yes, I had gone through the answer in that link before. Could you please explain how this load-and-retrain happens? Saving the pickle file, loading it, and retraining does not help me much in understanding how to retrain after loading the file. Do you have a link to a small prototype that shows how retraining is performed? I see another answer specifying a backpropagation method; if that is the case, maybe I have to shift to MLPClassifier? – Vindicable
Regarding classes, I tried the partial_fit() way like here, where I know all the classes beforehand. In the approach shown in that link, I kept a separate for loop for each new instance. If I keep all the partial_fit() calls in a single for loop, my old data is remembered; however, that is not what I am expecting, since keeping all the partial_fit() calls in a single loop makes it batch learning again. – Vindicable
The user in that answer suggests using TensorFlow. Maybe I have to leave scikit-learn for the retraining part? I tried scikit-learn's neural network, MLPClassifier, too; it also does not retrain in the way I am doing currently. Maybe I have to get the weights of the MLPClassifier. Do you have an idea how I could load weights and pass them to the next model? – Vindicable
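The online-learning route the comments keep circling can be sketched within scikit-learn itself: partial_fit() preserves the earlier weights between calls, provided every class you will ever see is declared up front via the classes argument on the first call. A hedged sketch using the question's data:

```python
# Incremental training with partial_fit: weights are updated, not reset.
import numpy as np
from sklearn import linear_model

all_classes = np.arange(8)  # must enumerate every class you will ever see

clf = linear_model.SGDClassifier(loss="hinge")

# First batch: old data, classes 0-3. `classes` is required on the 1st call.
clf.partial_fit([[8, 9], [20, 22], [16, 18], [8, 4]],
                [0, 1, 2, 3],
                classes=all_classes)

# Later batch: new data, classes 4-7. No refit from scratch.
clf.partial_fit([[18, 19], [234, 897], [20, 122], [16, 118]],
                [4, 5, 6, 7])

# The model can still emit the old labels, since it knows all 8 classes.
print(clf.predict([[8, 9]]))
```

Whether the old samples are still predicted correctly depends on the data and on feature scaling; the point is only that the class set and the weights survive across batches, which coef_init alone cannot give you.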
