Python least squares regression: modification to the objective function

Least squares regression is defined as the minimization of the sum of squared residuals, e.g.

Minimize(sum_squares(X * beta - y))

However, I'd like to propose a slight modification so that we instead minimize

Minimize(sum_modified_squares(X*beta - y))

where sum_modified_squares(X*beta - y) sums, over the observations i,

0                        if sign(X[i]*beta) == sign(y[i])
(X[i]*beta - y[i])**2    otherwise

Basically I want to penalize only when the sign of the prediction is not equal to the sign of the actual y. Is there any literature on this, or any implementations? I'm trying to implement this in cvxpy but am not sure how to do it.
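
For concreteness, here is a plain NumPy sketch of the loss I have in mind (the function name just mirrors the pseudocode above; this obviously isn't something cvxpy can accept directly, since it branches on the data):

import numpy as np

def sum_modified_squares(X, beta, y):
    # Squared residuals, but summed only over the rows where the sign of
    # the prediction disagrees with the sign of the actual y.
    pred = X @ beta
    resid = pred - y
    mismatch = np.sign(pred) != np.sign(y)
    return np.sum(resid[mismatch] ** 2)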

Crematory answered 20/8, 2017 at 18:17 Comment(8)
I told you in your earlier question that this is, imho, not possible within cvxpy because it's not convex, and therefore can't be formulated directly. The only working approach is to use mixed-integer programming, which is also possible within cvxpy, but I mentioned the consequences. Why is nothing about this in your question? And if you don't have a license for Gurobi, CPLEX, or Mosek, don't do it within cvxpy (or use the CBC interface, which needs some extra setup steps)!Aila
Sorry, I thought I had posted this in the stats sub-forum. How exactly do you think it can be done in cvxpy with mixed-integer/Boolean programming? I do have a Mosek license, so I'm not opposed to going down that route; I'm just not quite sure how mixed-integer programming solves the problem.Crematory
Indicator variables (e.g. b_0 = 1 if some expr >= 0), linearization of products of binary variables (e.g. b_0 == 1 ^ b_1 == 1 is equivalent to b_0 * b_1; edit: probably not needed, as a simple indicator constraint on a shifted sum works too), and some customized big-M expressions to construct the implications; a sketch of this formulation appears after these comments. Read any integer-programming guide (and search for keywords like indicator variables). It will be some work, but it's not hard. Whether it's nice to solve is another question. (I think Mosek's interface got some work; in the past, MIP usage was not possible.)Aila
This is all you need, I suppose.Aila
Thanks! Very much appreciated.Crematory
Quite frankly, this optimization is a little odd: not only is it non-convex, it is not even continuous. Imagine a beta such that, for some i, X[i, :]*beta is close to zero but very far from y[i]; then a small change in beta that causes X[i, :]*beta to flip sign would make the difference between no penalty and a big penalty. I know this doesn't solve your problem, but I feel this is the root cause that makes the problem "unsolvable", at least in a numerically robust sense.Oaten
This might also be helpful: en.wikipedia.org/wiki/Hinge_loss (a convex-surrogate sketch of this idea follows the comments).Cozen
Out of curiosity: why don't you code negative values as 0, positive values as 1, and solve a logistic regression?Scala
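
To make the mixed-integer route concrete, here is a minimal cvxpy sketch of the big-M formulation described in the comments above (the data, the value of M, and all variable names are purely illustrative; it assumes y has no exact zeros and needs a mixed-integer-capable solver such as Mosek or Gurobi):

import cvxpy as cp
import numpy as np

# Toy data; in practice X and y come from the actual problem.
rng = np.random.default_rng(0)
n, d = 30, 3
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

s = np.sign(y)   # target signs (assumes no exact zeros in y)
M = 1e3          # big-M constant; must dominate |X @ beta - y| and |X @ beta| at the optimum

beta = cp.Variable(d)
b = cp.Variable(n, boolean=True)   # b[i] = 1 means "prediction i has the correct sign"
r = cp.Variable(n, nonneg=True)    # r[i] ends up as |residual i| on penalized rows, 0 otherwise

resid = X @ beta - y
constraints = [
    r >= resid - M * b,                         # if b[i] = 0, force r[i] >= resid[i] ...
    r >= -resid - M * b,                        # ... and r[i] >= -resid[i], i.e. r[i] >= |resid[i]|
    cp.multiply(s, X @ beta) >= -M * (1 - b),   # if b[i] = 1, force the prediction to match sign(y[i])
]

prob = cp.Problem(cp.Minimize(cp.sum_squares(r)), constraints)
prob.solve(solver=cp.MOSEK)
print(beta.value)

At the optimum the solver sets b[i] = 1 (and hence r[i] = 0, no penalty) on every row where the prediction can take the correct sign, so the objective reduces to the sum of squared residuals over the sign-mismatched rows; the usual big-M caveats about numerical conditioning apply.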
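
Regarding the hinge-loss pointer: the convex surrogate replaces the discontinuous "no penalty / full penalty" switch with a loss that only charges predictions on the wrong side of zero, which cvxpy can handle directly. A minimal sketch, again with toy data, no regularization, and the margin arbitrarily fixed at 1:

import cvxpy as cp
import numpy as np

# Toy data, as in the sketch above.
rng = np.random.default_rng(0)
n, d = 30, 3
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
s = np.sign(y)   # target signs

beta = cp.Variable(d)
# Hinge loss: zero once s[i] * (X @ beta)[i] clears the margin, linear below it.
loss = cp.sum(cp.pos(1 - cp.multiply(s, X @ beta)))
cp.Problem(cp.Minimize(loss)).solve()
print(beta.value)

This is essentially a linear SVM; in practice one would usually add a norm penalty on beta, or combine the hinge term with the ordinary squared residuals if the magnitudes of the predictions still matter.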
