If we want to find the optimal parameters theta for a linear regression model via the normal equation

theta = inv(X^T * X) * X^T * y

one step is to calculate inv(X^T * X). For this, NumPy provides both np.linalg.inv() and np.linalg.pinv(). However, the two lead to different results:
import numpy as np

# Design matrix (4 samples, 5 features including the bias column) and targets
# (np.matrix is deprecated; plain arrays with @ behave the same here)
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([[460], [232], [315], [178]], dtype=float)

XTX = X.T @ X
pinv = np.linalg.pinv(XTX)        # Moore-Penrose pseudoinverse
theta_pinv = pinv @ X.T @ y
print(theta_pinv)
[[188.40031946]
[ 0.3866255 ]
[-56.13824955]
[-92.9672536 ]
[ -3.73781915]]
inv = np.linalg.inv(XTX)          # explicit inverse of X^T X
theta_inv = inv @ X.T @ y
print(theta_inv)
[[-648.7890625 ]
[ 0.79418945]
[-110.09375 ]
[ -74.0703125 ]
[ -3.69091797]]
The first output, i.e. the one from pinv, is the correct one, and it is also the approach recommended in the numpy.linalg.pinv() docs. But why is this, and what are the differences and pros/cons of inv() versus pinv()?
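A quick diagnostic (my own addition, not part of the original question) is to check the rank and conditioning of X^T X: with only 4 samples for 5 parameters, X^T X cannot have full rank, so inv() is inverting a numerically singular matrix, while pinv() handles the rank deficiency via its SVD cutoff. A minimal sketch using the question's data:

import numpy as np

X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)

XTX = X.T @ X
# 4 samples but 5 columns: XTX is 5x5 with rank at most 4, i.e. singular
print(np.linalg.matrix_rank(XTX))  # expected: 4
print(np.linalg.cond(XTX))         # expected: extremely large (>> 1e15)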
Comments:

Use solve with X.T @ X, or lstsq with X, as everyone else says; if you were intent on using pinv, the better way to do it would be theta_inv = np.linalg.pinv(X) @ y, which will still produce the same answer as your first calculation, and the same answer as np.linalg.solve(X.T @ X, X.T @ y) and np.linalg.lstsq(X, y). The large error in your second calculation is a great example of why not to directly calculate the inverse when solving. – Revenge

np.linalg.solve will not produce the same answer. – Revenge

theta_pinv = np.linalg.pinv(X) @ Y and theta_inv = np.linalg.inv(X.T @ X) @ X.T @ Y. – Bichromate
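For reference, a runnable sketch of the alternatives mentioned in the comments, applied to the question's data (my own illustration, not from the thread). Because X here has more columns than rows, X.T @ X is singular, which is why the solve-based route is contested above:

import numpy as np

X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([[460], [232], [315], [178]], dtype=float)

# SVD-based pseudoinverse applied to X directly
theta_pinv = np.linalg.pinv(X) @ y

# Least squares on X directly; rcond=None uses an SVD cutoff and returns
# the same minimum-norm solution as pinv for this rank-deficient X
theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(theta_pinv, theta_lstsq))  # expected: True

# Solving the normal equations avoids forming an explicit inverse, but here
# X.T @ X is singular, so this may raise or return an inaccurate result
try:
    theta_solve = np.linalg.solve(X.T @ X, X.T @ y)
    print(theta_solve)
except np.linalg.LinAlgError as err:
    print("solve failed:", err)

This matches the back-and-forth above: pinv(X) @ y and lstsq agree on the minimum-norm solution, whereas solve on the singular normal equations does not reliably reproduce it.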