The scikit-learn package provides the classes Lasso() and LassoCV() but no option to fit a logistic function instead of a linear one... How do I perform logistic lasso in Python?
The Lasso optimizes a least-squares problem with an L1 penalty. By definition you can't optimize a logistic function with the Lasso.
If you want to optimize a logistic function with an L1 penalty, you can use the LogisticRegression estimator with penalty='l1':
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# L1-penalized (lasso-style) logistic regression
log = LogisticRegression(penalty='l1', solver='liblinear')
log.fit(X, y)
Note that only the LIBLINEAR and SAGA (added in v0.19) solvers handle the L1 penalty.
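Since the question also mentions LassoCV(), note that a cross-validated analogue exists on the classification side as well. A minimal sketch using LogisticRegressionCV (the Cs grid, cv folds, and max_iter value here are arbitrary illustrative choices, not part of the original answer):

from sklearn.linear_model import LogisticRegressionCV
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Cross-validates over a grid of inverse regularization strengths (Cs),
# playing roughly the role for classification that LassoCV plays for regression.
log_cv = LogisticRegressionCV(Cs=10, penalty='l1', solver='saga',
                              cv=5, max_iter=5000)
log_cv.fit(X, y)
print(log_cv.C_)  # selected inverse regularization strength (one entry per class)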
The Lasso class only covers the least-squares case. Other classes support L1 regularization (LogisticRegression, NMF, ...), but there it is called "L1 regularization", not "Lasso". – Velma
You can use glmnet in Python. Glmnet uses warm starts and active-set convergence, which make it extremely efficient and faster than many other lasso implementations. You can download it from https://web.stanford.edu/~hastie/glmnet_python/
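As a rough sketch of what a lasso-penalized logistic fit looks like with glmnet_python (the import style, keyword names, and the glmnetCoef call below follow my reading of that project's documentation and may need adjusting for your installed version; the data are made up):

import numpy as np
import glmnet_python  # sets up the module path for the helper modules below
from glmnet import glmnet
from glmnetCoef import glmnetCoef

# glmnet expects float64 arrays and may modify them in place, hence the copies
x = np.random.rand(100, 5)
y = (np.random.rand(100) > 0.5).astype(np.float64)

# family='binomial' requests logistic regression; alpha=1.0 is the pure lasso penalty
fit = glmnet(x=x.copy(), y=y.copy(), family='binomial', alpha=1.0)
coefs = glmnetCoef(fit, s=np.array([0.01], dtype=np.float64))  # coefficients at lambda = 0.01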
1 scikit-learn: sklearn.linear_model.LogisticRegression
sklearn.linear_model.LogisticRegression from scikit-learn is probably the best: as @TomDLT said, Lasso is for the least-squares (regression) case, not logistic (classification).
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    penalty='l1',
    solver='saga',  # or 'liblinear'
    C=regularization_strength)  # note: C is the *inverse* of the regularization strength
model.fit(x, y)
2 python-glmnet: glmnet.LogitNet
You can also use Civis Analytics' python-glmnet library. This implements the scikit-learn BaseEstimator API:
# source: https://github.com/civisanalytics/python-glmnet#regularized-logistic-regression
from glmnet import LogitNet

m = LogitNet(
    alpha=1,  # 0 <= alpha <= 1; 0 for ridge, 1 for lasso
)
m = m.fit(x, y)
I'm not sure how to adjust the penalty with LogitNet, but I'll let you figure that out.
3 other
PyMC
You can also take a fully Bayesian approach. Rather than using L1-penalized optimization to find a point estimate for your coefficients, you can approximate the distribution of your coefficients given your data. With a Laplace prior on the coefficients, the maximum a posteriori (MAP) estimate coincides with the L1-penalized maximum likelihood estimate; the Laplace prior is what induces the sparsity.
The PyMC folks have a tutorial here on setting something like that up. Good luck.
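As a rough illustration (not taken from the linked tutorial), a Laplace-prior logistic regression might look like this with the current PyMC API, using made-up toy data:

import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # toy features, illustrative only
y = rng.integers(0, 2, size=200)   # toy binary labels, illustrative only

with pm.Model():
    # The Laplace prior plays the role of the L1 penalty: its MAP estimate
    # matches L1-penalized maximum likelihood, while sampling gives the full posterior.
    beta = pm.Laplace("beta", mu=0.0, b=1.0, shape=X.shape[1])
    intercept = pm.Normal("intercept", mu=0.0, sigma=10.0)
    logits = intercept + pm.math.dot(X, beta)
    pm.Bernoulli("obs", logit_p=logits, observed=y)
    idata = pm.sample(1000, tune=1000)  # posterior draws for the coefficients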