Is there a method in numpy for calculating the Mean Squared Error between two matrices?
I've tried searching but found none. Is it under a different name?
If there isn't, how do you overcome this? Do you write it yourself or use a different lib?
You can use:
mse = ((A - B)**2).mean(axis=ax)
Or
mse = (np.square(A - B)).mean(axis=ax)
with
ax=0 : the average is performed along the rows, for each column, returning an array
ax=1 : the average is performed along the columns, for each row, returning an array
ax=None : the average is performed element-wise over the whole array, returning a scalar value
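For example, with a small pair of illustrative arrays (the values here are arbitrary, just to show the shapes returned):
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.5, 2.0], [2.0, 5.0]])

print(((A - B) ** 2).mean(axis=0))     # per-column MSE: 0.625 and 0.5
print(((A - B) ** 2).mean(axis=1))     # per-row MSE: 0.125 and 1.0
print(((A - B) ** 2).mean(axis=None))  # overall MSE: 0.5625 (a single scalar)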
Note that with a = numpy.matrix([[5, 5], [5, 5]]) and then a ** 2, the result is the numpy matrix matrix([[50, 50], [50, 50]]), which shows that numpy matrix multiplication will not be element-wise. – Visit
np.ndarray will do an element-wise multiplication for a**2, but using a np.matrixlib.defmatrix.matrix will do a matrix multiplication for a**2... – Weissmann
Acmp = np.array(A, dtype=int) – Merline
np.nanmean((A - B) ** 2) if there are missing values – Mameluke
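A quick sketch of the behaviour these comments describe (the 2x2 values are just an example):
import numpy as np

a_arr = np.array([[5, 5], [5, 5]])
a_mat = np.matrix([[5, 5], [5, 5]])

print(a_arr ** 2)              # element-wise square: [[25 25] [25 25]]
print(a_mat ** 2)              # matrix product a_mat * a_mat: [[50 50] [50 50]]
print(np.asarray(a_mat) ** 2)  # converting to ndarray restores element-wise behaviour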
This isn't part of numpy, but it will work with numpy.ndarray objects. A numpy.matrix can be converted to a numpy.ndarray and a numpy.ndarray can be converted to a numpy.matrix.
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(A, B)
See the scikit-learn mean_squared_error documentation for how to control the axis of aggregation.
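For example, a minimal sketch with illustrative arrays; the multioutput='raw_values' option is scikit-learn's way of returning one MSE per column instead of a single averaged scalar:
import numpy as np
from sklearn.metrics import mean_squared_error

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.5, 2.0], [2.0, 5.0]])

print(mean_squared_error(A, B))                            # 0.5625, averaged over everything
print(mean_squared_error(A, B, multioutput='raw_values'))  # per-column MSE: 0.625 and 0.5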
Even more numpy
np.square(np.subtract(A, B)).mean()
Just for kicks
mse = (np.linalg.norm(A-B)**2)/len(A)
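Note that np.linalg.norm on a 2-D array defaults to the Frobenius norm, so norm(A - B)**2 is the sum of all squared differences; dividing by len(A) only matches the element-wise mean for 1-D inputs, since len(A) counts rows. Dividing by A.size (the total number of elements) covers both cases. A quick check with illustrative arrays:
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.5, 2.0], [2.0, 5.0]])

frob_sq = np.linalg.norm(A - B) ** 2  # Frobenius norm squared: sum of squared differences
print(frob_sq / A.size)               # 0.5625, same as ((A - B) ** 2).mean()
print(frob_sq / len(A))               # 1.125, divides by the 2 rows rather than all 4 elements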
Another alternative to the accepted answer that avoids any issues with matrix multiplication:
def MSE(Y, YH):
    return np.square(Y - YH).mean()
From the documentation for np.square:
Return the element-wise square of the input.
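A short usage sketch of the same function (the sample values are illustrative): because np.square is a ufunc it stays element-wise even on numpy.matrix inputs, sidestepping the ** 2 matrix-multiplication pitfall mentioned above.
import numpy as np

def MSE(Y, YH):
    return np.square(Y - YH).mean()

Y = np.matrix([[1.0, 2.0], [3.0, 4.0]])
YH = np.matrix([[1.5, 2.0], [2.0, 5.0]])
print(MSE(Y, YH))                          # 0.5625
print(MSE(np.asarray(Y), np.asarray(YH)))  # 0.5625, identical for plain ndarrays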
The standard numpy methods for calculating the mean squared error (variance) and its square root (standard deviation) are numpy.var() and numpy.std(); see their documentation. They apply to matrices and have the same syntax as numpy.mean().
I suppose that the question and the preceding answers might have been posted before these functions became available.
Remarks on statistics
To answer the comment made by @Drew:
This answer is equivalent to the top answers in this thread. Technically, MSE differs from variance in that it uses the "true" value of the parameter rather than its estimate; see
What's the difference between the variance and the mean squared error? and What is the Difference between Variance and MSE?. The two quantities then differ by the bias of our estimate of the central parameter. However, when calculating the sample variance, as is done in the OP, we cannot really know the value of this parameter. I believe the OP uses the term MSE in a loose sense.
Furthermore, the numpy functions proposed above accept the ddof parameter (the number of degrees of freedom), which makes it possible to obtain unbiased variance estimates (contrary to what is claimed in some superficial comparisons between Python and R).
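A small sketch (with illustrative arrays) of how these relate: np.var measures squared deviations around the mean of the differences, so it matches the MSE from the other answers only up to the squared mean of the differences, and ddof controls the divisor:
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.5, 2.0], [2.0, 5.0]])
D = A - B

mse = np.mean(D ** 2)              # mean of the squared differences
var = np.var(D)                    # variance: squared deviations from the mean of D
print(mse, var + np.mean(D) ** 2)  # identical: MSE = variance + (mean of D)**2
print(np.var(D, ddof=1))           # unbiased sample variance (divides by N - 1)
print(np.std(D))                   # standard deviation of the differences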
What about this to keep with the np.operation style?
mse = np.mean(np.square(A - B))
Just keep in mind that np.mean() with no axis keyword argument specified will output a scalar, just like np.sum().
((A - B) ** 2).mean(axis=ax), where ax=0 is per-column, ax=1 is per-row and ax=None gives a grand total. – Norty