How to perform logistic lasso in Python?

The scikit-learn package provides the classes Lasso() and LassoCV(), but no option to fit a logistic function instead of a linear one. How can I perform logistic lasso in Python?

Horseleech asked 13/1, 2017 at 16:47 Comment(1)
I still have no answer to this. I ended up performing the analysis in R using the glmnet package. – Horseleech

The Lasso optimizes a least-squares problem with an L1 penalty. By definition you can't optimize a logistic loss with the Lasso class.

If you want to optimize a logistic loss with an L1 penalty, you can use the LogisticRegression estimator with penalty='l1':

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# liblinear supports the L1 penalty (see the note on solvers below)
log = LogisticRegression(penalty='l1', solver='liblinear')
log.fit(X, y)

Note that only the liblinear and saga (added in scikit-learn v0.19) solvers handle the L1 penalty.

Velma answered 16/1, 2017 at 17:39 Comment(3)
Lasso isn't only used with least-squares problems. Any likelihood penalty (L1 or L2) can be used with any likelihood-formulated model, which includes any generalized linear model with an exponential-family likelihood, and thus logistic regression. – Masochism
Agreed. Originally defined for least squares, lasso regularization is easily extended to a wide variety of statistical models. In scikit-learn, though, the Lasso class covers only the least-squares case. Other classes include L1 regularization (LogisticRegression, NMF, ...), but there it is called "L1 regularization", not "Lasso". – Velma
Ah, OK. I thought you were referring to lasso generally. – Masochism

You can use glmnet in Python. Glmnet uses warm starts and active-set convergence, which make it extremely efficient and faster than many other lasso implementations. You can download it from https://web.stanford.edu/~hastie/glmnet_python/
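
A rough sketch of a logistic lasso fit with glmnet_python (the import pattern and the structure of the returned fit follow the package's documentation, but verify them against the version you install; the toy data is purely illustrative):

import numpy as np
import glmnet_python        # per the docs, import this before the glmnet modules
from glmnet import glmnet   # the main fitting routine

# toy data: 100 samples, 5 features, binary response
x = np.random.rand(100, 5)
y = (np.random.rand(100) > 0.5).astype(np.float64)

# family='binomial' fits a logistic model along an L1 (lasso) path
fit = glmnet(x=x.copy(), y=y.copy(), family='binomial')
print(fit['lambdau'])  # the sequence of regularization strengths tried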

Giannagianni answered 24/1, 2020 at 18:10 Comment(0)

1 scikit-learn: sklearn.linear_model.LogisticRegression

sklearn.linear_model.LogisticRegression from scikit-learn is probably the best option:

As the answer above said, Lasso is for the least-squares (regression) case, not logistic (classification).

from sklearn.linear_model import LogisticRegression

# x: your feature matrix, y: your binary labels
model = LogisticRegression(
    penalty='l1',
    solver='saga',  # or 'liblinear'; only these two support the L1 penalty
    C=1.0)          # inverse regularization strength: smaller C = stronger penalty

model.fit(x, y)

2 python-glmnet: glmnet.LogitNet

You can also use Civis Analytics' python-glmnet library. This implements the scikit-learn BaseEstimator API:

# source: https://github.com/civisanalytics/python-glmnet#regularized-logistic-regression

from glmnet import LogitNet

m = LogitNet(
    alpha=1,  # 0 <= alpha <= 1, 0 for ridge, 1 for lasso
)
m = m.fit(x, y)

I'm not sure how to adjust the penalty with LogitNet, but I'll let you figure that out; my best guess is sketched below.
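
An unverified guess at adjusting the penalty (lambda_path and lambda_best_ are assumptions based on the library's glmnet heritage; check the python-glmnet docs before relying on them):

import numpy as np
from glmnet import LogitNet

# supply candidate regularization strengths explicitly (assumed parameter)
m = LogitNet(alpha=1, lambda_path=np.logspace(-4, 1, 50))
m = m.fit(x, y)
print(m.lambda_best_)  # lambda picked by internal cross-validation (assumed attribute)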

3 Other

PyMC

You can also take a fully Bayesian approach. Rather than using L1-penalized optimization to find a point estimate for your coefficients, you can approximate the posterior distribution of your coefficients given your data. This gives you the same answer as L1-penalized maximum-likelihood estimation if you use a Laplace prior for the coefficients and take the posterior mode, since the Laplace prior induces sparsity. A sketch follows below.
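
A minimal sketch of that idea with PyMC (assuming the current pm.* API; the priors and toy data are illustrative choices, not from the tutorial mentioned below):

import numpy as np
import pymc as pm

# toy data: 100 samples, 5 features, binary response
x = np.random.rand(100, 5)
y = np.random.binomial(1, 0.5, size=100)

with pm.Model():
    # a Laplace prior on the coefficients induces sparsity, the Bayesian
    # analogue of the L1 penalty; b plays the role of the regularization strength
    beta = pm.Laplace('beta', mu=0.0, b=1.0, shape=x.shape[1])
    intercept = pm.Normal('intercept', mu=0.0, sigma=10.0)
    p = pm.math.sigmoid(pm.math.dot(x, beta) + intercept)
    pm.Bernoulli('obs', p=p, observed=y)
    idata = pm.sample(1000)  # draws from the posterior over the coefficients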

The PyMC folks have a tutorial on setting something like that up. Good luck.

Masochism answered 21/4, 2020 at 16:45 Comment(0)