Linear Regression vs. Closed-Form Ordinary Least Squares in Python

I am trying to apply linear regression to a dataset of 9 samples with around 50 features using Python. I have tried different approaches, i.e. closed-form OLS (ordinary least squares), LR (linear regression), HR (Huber regression), and NNLS (non-negative least squares), and each of them gives different weights.

I can see the intuition for why HR and NNLS produce different solutions, but LR and closed-form OLS have the same objective function: minimizing the sum of the squares of the differences between the observed values in the given sample and those predicted by a linear function of a set of features. Since the training set is singular, I had to use the pseudoinverse to perform closed-form OLS.
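Concretely, both should minimize $\lVert Xw - y \rVert_2^2$, whose normal-equations solution is $w = (X^\top X)^{-1} X^\top y$; since $X^\top X$ is singular here, I replace the inverse with the Moore-Penrose pseudoinverse: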

# Gram matrix X^T X (named `w` here, though it is not yet the weight vector)
w = np.dot(train_features.T, train_features)
# Minimum-norm weights w1 = (X^T X)^+ X^T y, equivalent to np.linalg.pinv(X) @ y
w1 = np.dot(np.linalg.pinv(w), np.dot(train_features.T, train_target))

For LR I have used scikit-learn's LinearRegression, which uses the LAPACK library from www.netlib.org to solve the least-squares problem:

       from sklearn import linear_model
       lr = linear_model.LinearRegression().fit(train_features, train_target)

A system of linear equations (or of polynomial equations) is called underdetermined if the number of equations is less than the number of unknown parameters. Each unknown parameter counts as an available degree of freedom, and each equation acts as a constraint that removes one degree of freedom. Consequently, an underdetermined system has either infinitely many solutions or no solution at all. Since in our case the system is underdetermined and also singular, there exist many solutions.
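To make that concrete (a toy example of my own, not the actual dataset): pinv picks the minimum-norm solution, and adding any null-space direction of X gives another, equally exact solution.

import numpy as np
from scipy.linalg import null_space

# Toy underdetermined system: 2 equations, 3 unknowns
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
y = np.array([1.0, 2.0])

w_min = np.linalg.pinv(X).dot(y)             # minimum-norm solution
w_other = w_min + 5.0 * null_space(X)[:, 0]  # shift along the null space of X

print(np.allclose(X.dot(w_min), y))    # True
print(np.allclose(X.dot(w_other), y))  # True: a different exact solution
print(np.linalg.norm(w_min) < np.linalg.norm(w_other))  # True: pinv picks the smallest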

Now both the pseudoinverse and the LAPACK routines try to find the minimum-norm solution of an underdetermined system when the number of samples is less than the number of features. Why, then, do the closed form and LR give completely different solutions to the same system of linear equations? Am I missing something here that can explain the behavior of both approaches? For example, if the pseudoinverse is computed in different ways (SVD, QR/LQ factorization), can those produce different solutions for the same set of equations?
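For what it's worth, SciPy lets you select the LAPACK driver explicitly, so one can check whether the factorization matters (again on toy data, not my dataset):

import numpy as np
from scipy.linalg import lstsq

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
y = np.array([1.0, 2.0])

# gelsd/gelss are SVD-based, gelsy uses a complete orthogonal (QR-like)
# factorization; all return the minimum-norm solution for rank-deficient problems
for driver in ('gelsd', 'gelss', 'gelsy'):
    w = lstsq(X, y, lapack_driver=driver)[0]
    print(driver, w)

print(np.linalg.pinv(X).dot(y))  # agrees with all three up to floating-point noise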

Herbalist answered 27/9, 2017 at 18:11

Check out the docs of sklearn's LinearRegression again.

By default (which is how you are calling it), it also fits an intercept term!

Demo:

import numpy as np
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older version
from sklearn.linear_model import LinearRegression

X, y = load_boston(return_X_y=True)

""" OLS custom: minimum-norm least squares via the pseudoinverse, no intercept """
w = np.dot(np.linalg.pinv(X), y)
print('custom')
print(w)

""" sklearn's LinearRegression (default: fits an intercept) """
clf = LinearRegression()
print('sklearn default')
print(clf.fit(X, y).coef_)


""" sklearn's LinearRegression (no intercept-fitting) """
print('sklearn fit_intercept=False')
clf = LinearRegression(fit_intercept=False)
print(clf.fit(X, y).coef_)

Output:

custom
[ -9.16297843e-02   4.86751203e-02  -3.77930006e-03   2.85636751e+00
  -2.88077933e+00   5.92521432e+00  -7.22447929e-03  -9.67995240e-01
   1.70443393e-01  -9.38925373e-03  -3.92425680e-01   1.49832102e-02
  -4.16972624e-01]
sklearn default
[ -1.07170557e-01   4.63952195e-02   2.08602395e-02   2.68856140e+00
  -1.77957587e+01   3.80475246e+00   7.51061703e-04  -1.47575880e+00
   3.05655038e-01  -1.23293463e-02  -9.53463555e-01   9.39251272e-03
  -5.25466633e-01]
sklearn fit_intercept=False
[ -9.16297843e-02   4.86751203e-02  -3.77930006e-03   2.85636751e+00
  -2.88077933e+00   5.92521432e+00  -7.22447929e-03  -9.67995240e-01
   1.70443393e-01  -9.38925373e-03  -3.92425680e-01   1.49832102e-02
  -4.16972624e-01]
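If you want the closed form to reproduce the default (intercept) fit as well, one way (a minimal sketch, continuing the demo above) is to append a column of ones to X; the extra coefficient then plays the role of the intercept:

# Append a bias column of ones so the pseudoinverse also fits an intercept
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
wb = np.dot(np.linalg.pinv(Xb), y)
print(wb[:-1])  # matches the 'sklearn default' coefficients above
print(wb[-1])   # matches the default fit's intercept_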
Demurrer answered 28/9, 2017 at 17:25
Thanks a lot. I was thinking that the normal equation also has a $W_0$ (bias) value in its weight vector, which would be the same as the intercept value in linear regression. I know it's a naive question, but when should one use fit_intercept and when not? In the two cases the weights are very different, and I am trying to interpret the coefficients. I saw in the source that the coefficients are scaled, but what is the purpose? Herbalist
The intercept just increases the model's capability and is usually a good idea. I'm not sure what you mean by scaling. Normalization is off by default, but normalization is one of the most important things for nearly all ML algorithms, especially for linear regression; see the sketch below. That's basic ML material that every course will explain. Demurrer
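To illustrate that last comment (a minimal sketch, not part of the original answer; StandardScaler in a pipeline is just one common way to normalize):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * [1.0, 100.0, 0.01]  # features on wildly different scales
y = X.sum(axis=1) + rng.normal(scale=0.1, size=100)

# Standardize each feature to zero mean / unit variance before fitting,
# so the fitted coefficients are comparable across features
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X, y)
print(model.named_steps['linearregression'].coef_)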
