Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for computing the inverse of a matrix in linear regression?

If we want to find the optimal parameters theta for a linear regression model using the normal equation

theta = inv(X^T * X) * X^T * y

one step is to compute inv(X^T * X). NumPy provides both np.linalg.inv() and np.linalg.pinv() for this.

However, the two functions give different results:

import numpy as np

X = np.matrix([[1, 2104, 5, 1, 45],
               [1, 1416, 3, 2, 40],
               [1, 1534, 3, 2, 30],
               [1,  852, 2, 1, 36]])
y = np.matrix([[460], [232], [315], [178]])

XT = X.T
XTX = XT @ X

pinv = np.linalg.pinv(XTX)
theta_pinv = (pinv @ XT) @ y
print(theta_pinv)

[[188.40031946]
 [  0.3866255 ]
 [-56.13824955]
 [-92.9672536 ]
 [ -3.73781915]]

inv = np.linalg.inv(XTX)
theta_inv = (inv @ XT) @ y
print(theta_inv)

[[-648.7890625 ]
 [   0.79418945]
 [-110.09375   ]
 [ -74.0703125 ]
 [  -3.69091797]]

The first output, i.e. the one from pinv, is the correct one, and pinv is also the function recommended in the numpy.linalg.pinv() docs. But why is that, and what are the differences and pros/cons of inv() versus pinv()?

Blinkers answered 19/3, 2018 at 7:4 Comment(5)
Your X matrix is 4x5 and so has rank at most 4; therefore X^T X, which is 5x5, has rank at most 4 and is not invertible. Rounding error may mean that a routine calculates an 'inverse', but it won't be the inverse, as there isn't one. – Conaway
Just to be clear, neither is recommended: we should use np.linalg.lstsq for this purpose, which avoids explicitly computing X^T X or any inverse at all. – Evalyn
This is not meant as an answer to your question, but it seems to me like you are misusing pinv slightly. While it's best to use solve with X.T @ X, or lstsq with X, as everyone else says, if you were intent on using pinv the better way to do it would be theta_pinv = np.linalg.pinv(X) @ y, which will still produce the same answer as your first calculation, and the same answer as np.linalg.solve(X.T @ X, X.T @ y) and np.linalg.lstsq(X, y). The large error in your second calculation is a great example of why not to explicitly compute the inverse when solving. – Revenge
My mistake, np.linalg.solve will not produce the same answer. – Revenge
The math is wrong. It should be theta_pinv = np.linalg.pinv(X) @ y and theta_inv = np.linalg.inv(X.T @ X) @ X.T @ y. – Bichromate
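The claims in the comments above can be checked directly. A minimal sketch, using the question's data as plain ndarrays rather than the deprecated np.matrix: X^T X is indeed rank-deficient, and np.linalg.pinv(X) @ y matches the least-squares solution from np.linalg.lstsq:

```python
import numpy as np

# Data from the question, as plain ndarrays
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])

XTX = X.T @ X
# X is 4x5, so the 5x5 matrix X^T X has rank at most 4 and is singular
print(np.linalg.matrix_rank(XTX))                    # 4

theta_pinv = np.linalg.pinv(X) @ y                   # pseudo-inverse of X directly
theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]   # SVD-based least squares
print(np.allclose(theta_pinv, theta_lstsq))          # True: both give the
                                                     # minimum-norm solution
```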

If the determinant of a matrix is zero, the matrix is singular and has no inverse, so inv will not work (or, because of rounding error, will return a meaningless result).

pinv, however, still works: it returns the true inverse of your matrix when one exists and the Moore-Penrose pseudo-inverse when it doesn't.

The different results of the two functions here come from rounding errors in floating-point arithmetic.

You can read more about how the Moore-Penrose pseudo-inverse works in the numpy.linalg.pinv documentation.
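A minimal illustration of this answer, as a sketch with a made-up 2x2 matrix: on an exactly singular matrix, inv raises LinAlgError, while pinv still returns a usable result.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])     # second row = 2 * first row, so det(A) == 0

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print("inv failed:", err)              # the matrix is singular

A_pinv = np.linalg.pinv(A)                 # the pseudo-inverse always exists
# One of the Moore-Penrose conditions: A @ A_pinv @ A == A,
# even though A has no true inverse.
print(np.allclose(A @ A_pinv @ A, A))      # True
```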

Wolfenbarger answered 19/3, 2018 at 7:21 Comment(6)
Can you explain why the pinv and inv results are comparable? – Borowski
I don't see why they would not be. Just because it is a different approach to getting an answer doesn't mean the answer will be different. – Wolfenbarger
Yes, but can one also demonstrate that? Does the documentation imply it? Right now it's just your opinion. – Borowski
I really don't want to be unhelpful, but please check out the formula for the inverse of a matrix and the Moore-Penrose inverse. It's actually a math concept. – Wolfenbarger
Might be worth pointing out that, in the case where it is possible to calculate the matrix inverse, the (Moore-Penrose) pseudo-inverse is mathematically equivalent to it. This isn't an issue of code documentation; it's a mathematical construct (in the same way that a matrix inverse is a mathematical construct). – Orlene

inv and pinv are meant for computing the (pseudo-)inverse as a standalone matrix, not for using that matrix as an intermediate step in further computations.

For solving linear systems like this, the proper tool is numpy.linalg.lstsq (or its SciPy equivalent) if the coefficient matrix may be non-invertible, or numpy.linalg.solve (or its SciPy equivalent) for invertible matrices.
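For example, a sketch with made-up data: solve for a square invertible system, lstsq for an overdetermined or rank-deficient one.

```python
import numpy as np

# Square, invertible system: prefer solve over inv(A) @ b
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)                       # [2, 3] up to rounding

# Overdetermined system (3 equations, 2 unknowns): lstsq returns
# the least-squares fit of a line through (1,1), (2,2), (3,2)
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])
theta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(theta)                   # intercept 2/3, slope 0.5
```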

Upbear answered 19/3, 2018 at 13:55 Comment(1)
Are inv and pinv "comparable" (i.e. do they produce nearly the same result when used as part of a computation)? – Borowski

Inverting X^T X is not as numerically stable as computing the pseudo-inverse with numpy's pinv when X^T X has relatively very small singular values, which can occur when a matrix is "almost" rank-deficient but, due to noise, isn't exactly. This happens in particular when the condition number, i.e. the ratio of the largest to the smallest singular value, is large. Note that forming X^T X squares the condition number of X, which is why SVD-based methods that work on X directly are preferable.
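This can be made concrete with a sketch: build a synthetic X with chosen singular values (so its condition number is known by construction) and observe that cond(X^T X) is cond(X) squared.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build X = U diag(s) V^T with chosen singular values s
U, _ = np.linalg.qr(rng.standard_normal((50, 5)))   # orthonormal 50x5 factor
V, _ = np.linalg.qr(rng.standard_normal((5, 5)))    # orthogonal 5x5 factor
s = np.array([1.0, 1e-1, 1e-2, 1e-3, 1e-4])         # cond(X) = 1e4 by design
X = U @ np.diag(s) @ V.T

cond_X = np.linalg.cond(X)           # ~1e4
cond_XTX = np.linalg.cond(X.T @ X)   # ~1e8, i.e. cond(X)**2
print(cond_X, cond_XTX)
```

So with the normal equations you lose roughly twice as many significant digits as with a method (lstsq, pinv of X itself) that works on X directly.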

Christenachristendom answered 23/4, 2022 at 22:53 Comment(0)
