Difference between numpy.linalg.lstsq and sklearn.linear_model.LinearRegression

As I understand it, numpy.linalg.lstsq and sklearn.linear_model.LinearRegression both look for solutions x of the linear system Ax = y that minimise the residual norm ||Ax - y||.

But they don't give the same result:

from sklearn import linear_model
import numpy as np

A = np.array([[1, 0], [0, 1]])
b = np.array([1, 0])
x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
x

Out[1]: array([ 1.,  0.])

clf = linear_model.LinearRegression()
clf.fit(A, b)                              
coef = clf.coef_
coef

Out[2]: array([ 0.5, -0.5])

What am I overlooking?

Zermatt answered 12/4, 2016 at 12:18 Comment(3)
As @cel noted, the only difference is the intercept. You can use linear_model.LinearRegression(fit_intercept=False) to get the same result as np.linalg.lstsq. Quintillion
A simple but easy-to-overlook detail. Thanks! Zermatt
If one of you would post the answer, I could mark this question answered. Zermatt

Both of them are implemented on top of the LAPACK gelsd routine.

The difference is that linear_model.LinearRegression preprocesses the input X (your A) by default, as shown below, whereas np.linalg.lstsq does not. You can refer to the source code of LinearRegression for more details about this preprocessing.

X = (X - X_offset) / X_scale

If you don't want this preprocessing, set fit_intercept=False.
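
As a quick check of that claim, here is a minimal sketch using the same A and b as in the question. It assumes a NumPy version that accepts rcond=None (1.14 or later); the variable names x and reg are only illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0])

# Plain least squares: no centering, no intercept term.
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# Disable the intercept so that scikit-learn skips the centering step.
reg = LinearRegression(fit_intercept=False).fit(A, b)

print(x)               # solution from NumPy
print(reg.coef_)       # should agree with x up to floating-point error
print(reg.intercept_)  # 0.0, because no intercept is fitted
```

With the intercept disabled, both calls solve the same problem, so the coefficients should agree.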

In short, if you standardize your input before fitting, linear_model.LinearRegression and np.linalg.lstsq return the same coefficients, as shown below.

# Normalization/Scaling
import numpy as np
from sklearn.preprocessing import StandardScaler

A = np.array([[1, 0], [0, 1]])
X_scaler = StandardScaler()
A = X_scaler.fit_transform(A)  # center each column and scale it to unit variance

Now A is array([[ 1., -1.],[-1., 1.]])

from sklearn import linear_model
import numpy as np

b = np.array([1, 0])
x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
x
Out[1]: array([ 0.25, -0.25])

clf = linear_model.LinearRegression()
clf.fit(A, b)                              
coef = clf.coef_
coef

Out[2]: array([ 0.25, -0.25])
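
The other direction can be sketched too: the default LinearRegression result for the original, unscaled A can be reproduced with np.linalg.lstsq by centering A and b by hand. This is only an illustration of the centering idea, not a copy of scikit-learn's internal code; the names A_mean, b_mean, coef_manual and intercept_manual are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0])

# Center the columns of A and the target b by hand.
A_mean = A.mean(axis=0)
b_mean = b.mean()

# Least squares on the centered data gives the slope coefficients.
coef_manual, *_ = np.linalg.lstsq(A - A_mean, b - b_mean, rcond=None)

# The intercept is recovered from the means afterwards.
intercept_manual = b_mean - A_mean @ coef_manual

reg = LinearRegression().fit(A, b)  # fit_intercept=True is the default

print(coef_manual, intercept_manual)
print(reg.coef_, reg.intercept_)
```

Both lines should print the same coefficients, matching the array([ 0.5, -0.5]) from the question, together with the intercept that np.linalg.lstsq on the raw A never estimates.
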
Gwenni answered 14/11, 2016 at 7:20 Comment(1)
As of scikit-learn 1.4, X is no longer divided by X_scale. fit_intercept=True still causes centering, but not scaling. See PR #27855. Melvinmelvina