Python: l2-Penalty for logistic regression model from statsmodels?
Is there a way to put an l2-Penalty for the logistic regression model in statsmodel through a parameter or something else? I just found the l1-Penalty in the docs but nothing for the l2-Penalty.

Stoltz answered 21/5, 2016 at 6:44 Comment(4)
Can you provide a link or address to the documentation you've found? – Grisons
I'm referring to this model from statsmodels: statsmodels.sourceforge.net/0.6.0/generated/… To fit the model with regularization, you can probably use this method: statsmodels.sourceforge.net/0.6.0/generated/… For the method parameter, I only found the options `'l1'` and `'l1_cvxopt_cp'`, both of which are probably the options I'm searching for – Stoltz
Wow! That documentation is not clear. I'm sorry, I can't make any definitive sense of it. I've written solvers that handle L2; it's generally easier to solve with an L2 penalty because the cost function is everywhere differentiable, so the gradient exists. I have to assume it's in there somewhere. Wish I could help more. – Grisons
Sorry, in my first comment I meant, of course, that both options are NOT the ones I'm looking for. I'm not deep into the details of implementing a penalty in a logistic regression model, so I'm just looking for an easy way to turn the L2 penalty on and off, as in scikit-learn. Unfortunately, that package doesn't give such a nice summary of the logistic regression with all the p-values and so on – Stoltz

The models in statsmodels.discrete, like Logit, Poisson, and MNLogit, currently have only L1 penalization. However, elastic net for GLM and a few other models has recently been merged into statsmodels master.

GLM with a Binomial family and a binary response is the same model as discrete.Logit, although the implementation differs. See my answer on L2 penalization in "Is ridge binomial regression available in Python?"

What has not yet been merged into statsmodels is L2 penalization with a structured penalization matrix, as used, for example, as a roughness penalty in generalized additive models (GAM) and spline fitting.

Paraphrast answered 28/5, 2016 at 13:19 Comment(0)

If you look closely at the documentation for statsmodels.regression.linear_model.OLS.fit_regularized, you'll see that the current version of statsmodels allows Elastic Net regularization, which is basically a convex combination of the L1 and L2 penalties (though more robust implementations apply some post-processing to counter undesirable behaviors of the naive version; see "Elastic Net" on Wikipedia for details):

β̂ = argmin_β ( ||y − Xβ||² + λ₂||β||² + λ₁||β||₁ )

If you take a look at the parameters for fit_regularized in the documentation:

OLS.fit_regularized(method='elastic_net', alpha=0.0, L1_wt=1.0, start_params=None, profile_scale=False, refit=False, **kwargs)

you'll see that L1_wt weights the L1 term of the penalty (with alpha scaling the overall strength), so it plays the role of lambda_1 in the equation above. To get the pure L2 penalty you're looking for, just pass L1_wt=0 when you call the function. As an example:

import statsmodels.api as sm

model = sm.OLS(y, X)
# L1_wt=0.0 drops the L1 term, leaving a pure L2 (ridge) penalty
results = model.fit_regularized(method='elastic_net', alpha=1.0, L1_wt=0.0)
print(results.params)

fits an L2-penalized regression predicting the target y from the inputs X. (The regularized results object does not implement summary(), so print the fitted params instead.)

Three final comments:

  1. statsmodels currently implements only elastic_net as an option for the method argument. That gives you L1, L2, and any convex combination of the two, but nothing else (for OLS, at least);

  2. L1 Penalized Regression = LASSO (least absolute shrinkage and selection operator);

  3. L2 Penalized Regression = Ridge Regression, the Tikhonov–Miller method, the Phillips–Twomey method, the constrained linear inversion method, and the method of linear regularization.

Concert answered 5/6, 2017 at 21:32 Comment(4)
Cool, but this is OLS, not logistic regression. – Barbican
For regularized logistic regression: `m = statsmodels.genmod.generalized_linear_model.GLM(y, X, family=families.Binomial(link=links.Logit))`; `m.fit_regularized(...)`? – Barbican
So... in your example, does L2 = alpha when L1_wt=0? (alpha = L1 + L2... does this work if we want to set L1=0, L2=10?) – Barbican
When I try `logistic_regression_model = sm.GLM(y, X, link=sm.genmod.families.links.logit)`; `results = logistic_regression_model.fit_regularized(alpha=1.)`; `results.summary()`, I get an error on `results.summary()` saying summary is not implemented. Why does my `fit_regularized()` not return a summary? – Laris
