Hyperparameter in Voting classifier
So, I have a classifier which looks like

clf = VotingClassifier(estimators=[
        ('nn', MLPClassifier()),
        ('gboost', GradientBoostingClassifier()),
        ('lr', LogisticRegression()),
        ], voting='soft')

I want to tune the hyperparameters of each of the estimators.

Is there a way to tune these "combinations" of classifiers? Thanks

Kiloton answered 5/10, 2017 at 7:31
You can do this using GridSearchCV, with a small modification. In the parameters dictionary, instead of specifying the attribute directly, use the key of the classifier in the VotingClassifier object, followed by __, and then the attribute itself.

Check out this example:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV

X = np.array([[-1.0, -1.0], [-1.2, -1.4], [-3.4, -2.2], [1.1, 1.2],
              [-1.0, -1.0], [-1.2, -1.4], [-3.4, -2.2], [1.1, 1.2]])
y = np.array([1, 1, 2, 2, 1, 1, 2, 2])

eclf = VotingClassifier(estimators=[
    ('svm', SVC(probability=True)),
    ('lr', LogisticRegression()),
    ], voting='soft')

# Use the key for the classifier followed by __ and the attribute
params = {'lr__C': [1.0, 100.0],
          'svm__C': [2, 3, 4]}

grid = GridSearchCV(estimator=eclf, param_grid=params, cv=2)

grid.fit(X, y)

print(grid.best_params_)
# {'lr__C': 1.0, 'svm__C': 2}
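Applying the same key-plus-__ convention to the three estimators from the question, a minimal sketch (the parameter values here are illustrative, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Toy data just to make the example runnable
X, y = make_classification(n_samples=100, random_state=0)

clf = VotingClassifier(estimators=[
    ('nn', MLPClassifier(max_iter=500)),
    ('gboost', GradientBoostingClassifier()),
    ('lr', LogisticRegression()),
], voting='soft')

# Each key is '<estimator key>__<parameter name>'
params = {
    'nn__hidden_layer_sizes': [(10,), (20,)],
    'gboost__n_estimators': [10, 50],
    'lr__C': [0.1, 1.0],
}

grid = GridSearchCV(estimator=clf, param_grid=params, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```

The search tunes all three estimators jointly, so the best combination is chosen by the ensemble's cross-validated score rather than per-estimator.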
Tranche answered 5/10, 2017 at 8:36
Does param_grid try all possible combinations of keys between [1, 100] and [2, 3, 4], like 300 combinations, and give the best result? Or is there some other meaning for param_grid? – Godevil
@Godevil It's not between 1 to 100. It's 1 and 100. So there are only 2×3 = 6 combinations. Out of these 6 combinations, it gives the best result. – Relation
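The combination count described above can be verified directly with scikit-learn's ParameterGrid, which enumerates the same Cartesian product that GridSearchCV evaluates:

```python
from sklearn.model_selection import ParameterGrid

params = {'lr__C': [1.0, 100.0],
          'svm__C': [2, 3, 4]}

# GridSearchCV tries the Cartesian product of the listed values
combos = list(ParameterGrid(params))
print(len(combos))  # 2 x 3 = 6 combinations
```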
Use GridSearchCV:

clf = VotingClassifier(
          estimators=[('lr', LogisticRegression()),
                      ('gboost', GradientBoostingClassifier())],
          voting='soft')

# Put the combinations of parameters here (estimator key + '__' + parameter name)
p = [{'lr__C': [1, 2], 'gboost__n_estimators': [10, 20]}]

grid = GridSearchCV(clf, p, cv=5, scoring='neg_log_loss')
grid.fit(X_train, Y_train)
Komatik answered 5/10, 2017 at 13:58

© 2022 - 2024 — McMap. All rights reserved.