Is LASSO regression implemented in Statsmodels?
M

2

14

I would love to use linear LASSO regression within statsmodels so that I can write the model in 'formula' notation. That would save me quite some coding time when working with many categorical variables and their interactions. However, it seems that it is not implemented in statsmodels yet. Is that right?

Machinate answered 17/4, 2017 at 7:11 Comment(0)
I
19

Lasso is indeed implemented in statsmodels, through the fit_regularized method of OLS. The documentation is at the URL below:

http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.fit_regularized.html

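To be precise, statsmodels implements elastic net regularization, which combines L1 and L2 penalties; their relative weight is set by the L1_wt parameter. With L1_wt=1 (the default) the fit is a lasso fit, and with L1_wt=0 it is a ridge fit. You should look at the formula at the bottom of that documentation page to make sure you are doing exactly what you want to do.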

Besides the elastic net implementation, statsmodels also implements a square root Lasso, selected with method='sqrt_lasso' in the same fit_regularized call.
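Since the question was about formula notation, here is a minimal sketch of a lasso fit through the formula interface. The data frame and its column names (outcome, var1, var2) are made up for illustration, and the penalty weight 0.05 is an arbitrary choice, not a recommendation. It relies on alpha accepting one weight per coefficient, which is used here to leave the intercept unpenalized.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data, invented purely for illustration
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "var1": rng.choice(["a", "b", "c"], size=n),
    "var2": rng.choice(["x", "y"], size=n),
})
df["outcome"] = 1.0 + 2.0 * (df["var1"] == "b") + rng.normal(scale=0.5, size=n)

# Formula notation builds the dummy variables and interactions
model = smf.ols("outcome ~ C(var1) * C(var2)", data=df)

# One penalty weight per coefficient; 0 leaves the intercept unpenalized.
# The value 0.05 is an assumption and would normally be tuned.
alpha = np.full(model.exog.shape[1], 0.05)
alpha[model.exog_names.index("Intercept")] = 0.0

# L1_wt=1.0 makes the elastic net penalty a pure L1 (lasso) penalty
lasso_fit = model.fit_regularized(method="elastic_net", alpha=alpha, L1_wt=1.0)

coefs = pd.Series(np.asarray(lasso_fit.params), index=model.exog_names)
print(coefs)
```

Passing a scalar alpha instead applies the same penalty to every coefficient, including the intercept, which is usually not what one wants with formula-built design matrices.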

Irradiance answered 16/6, 2017 at 16:16 Comment(1)
Correct! I stopped when I read that statsmodels uses elastic net regularization. And now you made me realize that LASSO and ridge regularization are just special cases of elastic net regularization. Thanks! Machinate
M
0

One can use Patsy with scikit-learn to build the same design matrices one would get from the formula notation in statsmodels. See code below:

import numpy as np
from patsy import dmatrices

# df is a pandas DataFrame with columns 'outcome', 'var1' and 'var2'
# create dummy variables, and their interactions
y, X = dmatrices('outcome ~ C(var1)*C(var2)', df, return_type="dataframe")
# flatten y into a 1-D array so scikit-learn can understand it
y = np.ravel(y)

I can now use any model implemented in scikit-learn in the usual way, with X as the independent variables and y as the dependent variable.
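As a hedged end-to-end sketch of this approach, the snippet below feeds the Patsy matrices into scikit-learn's Lasso. The data and the alpha value are assumptions made up for illustration. Patsy adds an 'Intercept' column of ones, which is dropped here because Lasso fits its own unpenalized intercept by default.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices
from sklearn.linear_model import Lasso

# Synthetic data, invented purely for illustration
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "var1": rng.choice(["a", "b", "c"], size=n),
    "var2": rng.choice(["x", "y"], size=n),
})
df["outcome"] = 1.0 + 2.0 * (df["var1"] == "b") + rng.normal(scale=0.5, size=n)

# Build dummy variables and interactions from the formula
y, X = dmatrices("outcome ~ C(var1) * C(var2)", df, return_type="dataframe")
y = np.ravel(y)

# Drop Patsy's constant column; Lasso(fit_intercept=True) handles the intercept
X = X.drop(columns="Intercept")

# alpha=0.05 is an arbitrary choice and would normally be tuned,
# for example with LassoCV
lasso = Lasso(alpha=0.05).fit(X, y)

coefs = pd.Series(lasso.coef_, index=X.columns)
print(lasso.intercept_)
print(coefs)
```

Keeping X as a DataFrame preserves the Patsy column names, so the fitted coefficients can be matched back to the dummy variables and interaction terms.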

Machinate answered 17/4, 2017 at 9:50 Comment(0)
