Elastic net regression or lasso regression with weighted samples (sklearn)
Asked Answered
Scikit-learn allows sample weights to be provided to linear, logistic, and ridge regressions (among others), but not to elastic net or lasso regressions. By sample weights, I mean each element of the input to fit on (and the corresponding output) is of varying importance, and should have an effect on the estimated coefficients proportional to its weight.

Is there a way I can manipulate my data before passing it to ElasticNet.fit() to incorporate my sample weights?

If not, is there a fundamental reason it is not possible?

Thanks!

Personage answered 3/10, 2017 at 4:4 Comment(2)
This code is presented by someone at Stanford who works with Trevor Hastie (one of the main authors of elastic net). It does support weights and it's Python. If you take a look at this vignette, at the first equation, I think you can see how to manipulate the data to inject weights into your scikit-learn package. Just make sure the average of the weights is 1 so that any preset limits on the grid of lambda values remain OK.Pickaninny
I should have said that you can see how to apply weighting on your own by inspecting the first two equations of the vignette.Pickaninny

You can read some discussion about this in sklearn's issue-tracker.

It basically reads like:

  • not that hard to do (theory-wise)
  • a pain to keep all the basic sklearn APIs consistent while supporting all possible cases (dense vs. sparse)

As you can see in this thread and the linked one about adaptive lasso, there is not much activity there (probably because not many people care and the related paper is not popular enough; but that's only a guess).

Depending on your exact task (size? sparseness?), you could build your own optimizer quite easily based on scipy.optimize, supporting this kind of sample-weights (which will be a bit slower, but robust and precise)!
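To make the scipy.optimize suggestion concrete, here is a minimal sketch of a weighted elastic-net objective minimized directly. The function name, the weight normalization, and the choice of the derivative-free Powell method are my own assumptions, not part of the original answer; note that a generic optimizer handles the non-smooth L1 term only approximately, so for exact sparsity you would want a proximal or coordinate-descent method instead:

```python
import numpy as np
from scipy.optimize import minimize

def weighted_elastic_net(X, y, w, alpha=1.0, l1_ratio=0.5):
    """Sketch: minimize a sample-weighted elastic-net objective.

    Objective: sum(w_i * (y_i - x_i @ beta)^2) / (2n)
               + alpha * l1_ratio * ||beta||_1
               + 0.5 * alpha * (1 - l1_ratio) * ||beta||_2^2
    Weights are normalized to mean 1 so alpha keeps its usual scale.
    """
    X, y, w = np.asarray(X), np.asarray(y), np.asarray(w, dtype=float)
    w = w / w.mean()
    n, p = X.shape

    def objective(beta):
        resid = y - X @ beta
        data_fit = np.sum(w * resid**2) / (2 * n)
        l1 = alpha * l1_ratio * np.sum(np.abs(beta))
        l2 = 0.5 * alpha * (1 - l1_ratio) * np.sum(beta**2)
        return data_fit + l1 + l2

    # Powell is derivative-free, so the non-differentiable |beta| term
    # does not break it; it will be slower than coordinate descent.
    result = minimize(objective, np.zeros(p), method="Powell")
    return result.x
```

This is robust and precise enough for small dense problems, as the answer suggests, but it will not scale to large or sparse designs.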

Stronski answered 3/10, 2017 at 11:27 Comment(6)
Hi, assuming that I want to use lasso or elasticNet with a dense matrix, would there be anything wrong with scaling the input matrix and response by the square root of the sample weights? (The same way it is done when you map a weighted-OLS problem onto a standard OLS.)Pediatrician
@Pediatrician Yes, if you would pass X = dot(sqrt(diag(weights)), X) and y = dot(sqrt(diag(weights)), y) to the lasso or elasticNet that would be OK to take into account weights. Only problem is that the fit metric used during cross validation would need access to X, y AND your weights to properly calculate the out of sample weighted MSE, otherwise your regularisation parameter would not be tuned correctly...Creamy
I find it incredible, by the way, that something as basic as sample weights is currently not implemented in sklearn...Creamy
@Tom Wenseleers: thanks for the answer, would you have a reference for me to understand what are you referring to with the "fit metric used during cross validation would need access to X, y AND your weights to properly calculate the out of sample weighted MSE"?Pediatrician
@Pediatrician Well, it's just that you would have to calculate the weighted root mean square error, which in R code is weighted.rmse <- function(actual, predicted, weight){ sqrt(sum((predicted-actual)^2*weight)/sum(weight)) }, where predicted = X %*% coefficients. But these predicted values can only be calculated if the function has access to the original X instead of sqrt(w) * X. So sample weights would really have to be properly supported by the scikit-learn API, as otherwise you will have to write your own predict & CV functions, I believe... No Python expert though...Creamy
I think this is also correct btw if X is sparse. You might have to use the argument fit_intercept=False though, and maybe add a column of 1s yourself in case you have an intercept...Creamy
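The rescaling trick from the comments above can be sketched as follows. The helper name and the mean-1 weight normalization are my own choices, not from the thread; the explicit intercept column with fit_intercept=False follows the last comment, since the intercept must be rescaled along with the features:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def fit_weighted_enet(X, y, w, **enet_kwargs):
    """Sketch: inject sample weights into ElasticNet by scaling rows
    of X (plus an intercept column) and y by sqrt(w)."""
    X, y = np.asarray(X), np.asarray(y)
    w = np.asarray(w, dtype=float)
    w = w / w.mean()          # mean weight 1, so alpha keeps its scale
    sw = np.sqrt(w)

    # Add an explicit intercept column and disable sklearn's own,
    # because the built-in intercept would not be rescaled by sqrt(w).
    X_aug = np.column_stack([np.ones(len(y)), X])
    model = ElasticNet(fit_intercept=False, **enet_kwargs)
    model.fit(sw[:, None] * X_aug, sw * y)

    intercept, coefs = model.coef_[0], model.coef_[1:]
    return intercept, coefs
```

As noted in the comments, predictions and cross-validation scoring must still be done against the original (unscaled) X, with a weighted error metric, so the tuned regularization parameter remains meaningful.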
