Numpy dot too clever about symmetric multiplications

Anybody know about documentation for this behaviour?

import numpy as np
A  = np.random.uniform(0,1,(10,5))
w  = np.ones(5)
Aw = A*w
Sym1 = Aw.dot(Aw.T)
Sym2 = (A*w).dot((A*w).T)
diff = Sym1 - Sym2
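print(diff.max())  # added for illustration: small but non-zero, e.g. ~4e-16; exact value varies per run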

diff.max() is non-zero, but on the order of machine precision, e.g. 4.4e-16.

This (the discrepancy from 0) is usually fine... in a finite-precision world we should not be surprised.

Moreover, I would guess that numpy is being smart about symmetric products, to save flops and ensure symmetric output...

But I deal with chaotic systems, and this small discrepancy quickly becomes noticeable when debugging. So I'd like to know exactly what's going on.

Aile answered 17/4, 2017 at 14:46 Comment(7)
Since your code will give varying output from run to run, please show a sample output and state more clearly what is undesirable about that output.Liv
Did you try enforcing the use of doubles (np.float64)?Pneumogastric
@TomdeGeus how? Anyway, note that I don't really care that the difference is non-zero. I just want to have the behaviour (which clearly comes from numpy being clever) explained.Aile
You could use .astype(np.float64) on the definitions of A and w. BTW, according to NumPy the machine precision on my machine is np.finfo(np.float64).eps = 2.2e-16, while diff.max() = 1.1e-16, i.e. within machine precision.Pneumogastric
Any difference if you use B=Aw.T.copy()?Qianaqibla
While this discrepancy is somewhat curious, I wonder in what non-trivial debugging test it could possibly matter, since any proper calculation incurs a comparable error anyway.Pilgrim
@Pilgrim yes, I concur. But as mentioned, in my testing, chaos works to grow the discrepancies "exponentially". I should be unit-testing at a lower level, but sometimes I'm too lazy. Anyways, I just wanted the explanation for the result, not to change it.Aile

This behaviour is the result of a change introduced for NumPy 1.11.0, in pull request #6932. From the release notes for 1.11.0:

Previously, gemm BLAS operations were used for all matrix products. Now, if the matrix product is between a matrix and its transpose, it will use syrk BLAS operations for a performance boost. This optimization has been extended to @, numpy.dot, numpy.inner, and numpy.matmul.

In the changes for that PR, one finds this comment:

/*
 * Use syrk if we have a case of a matrix times its transpose.
 * Otherwise, use gemm for all other cases.
 */

So NumPy is making an explicit check for the case of a matrix times its transpose, and calling a different underlying BLAS function in that case. As @hpaulj notes in a comment, such a check is cheap for NumPy, since a transposed 2d array is simply a view on the original array, with inverted shape and strides, so it suffices to check a few pieces of metadata on the arrays (rather than having to compare the actual array data).
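
As a quick illustration (my own snippet, not from the PR), the transpose exposes exactly that metadata from Python: it is a view whose base is the original array and whose strides are the original strides reversed.

import numpy as np
A = np.random.uniform(size=(10, 5))
At = A.T
print(At.base is A)          # True: the transpose is just a view on A's data
print(A.shape, A.strides)    # (10, 5) with C-contiguous strides, e.g. (40, 8)
print(At.shape, At.strides)  # (5, 10) with the same strides reversed, e.g. (8, 40)

NumPy's actual test lives in the C code changed by that PR; the point here is only that no array data needs to be inspected.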

Here's a slightly simpler case that shows the discrepancy. Note that calling .copy() on one of the arguments to dot is enough to defeat NumPy's special-casing.

import numpy as np
random = np.random.RandomState(12345)
A = random.uniform(size=(10, 5))
Sym1 = A.dot(A.T)
Sym2 = A.dot(A.T.copy())
print(abs(Sym1 - Sym2).max())

I guess one advantage of this special-casing, beyond the obvious potential for speed-up, is that you're guaranteed (I'd hope, but in practice it'll depend on the BLAS implementation) to get a perfectly symmetric result when syrk is used, rather than a matrix which is merely symmetric up to numerical error. As an (admittedly not very good) test for this, I tried:

import numpy as np
random = np.random.RandomState(12345)
A = random.uniform(size=(100, 50))
Sym1 = A.dot(A.T)
Sym2 = A.dot(A.T.copy())
print("Sym1 symmetric: ", (Sym1 == Sym1.T).all())
print("Sym2 symmetric: ", (Sym2 == Sym2.T).all())

Results on my machine:

Sym1 symmetric:  True
Sym2 symmetric:  False
Cladding answered 17/4, 2017 at 15:28 Comment(2)
A transpose is a view with flipped contiguity (C-contiguous becomes F-contiguous) and reversed strides. So it could be identified by just comparing a few attributes, without looking at any data (which would be expensive).Qianaqibla
Just discovered that this change is documented in the release notes, at least. Answer updated.Cladding

I suspect this has to do with promotion of intermediate floating-point registers to 80-bit precision. Somewhat confirming this hypothesis is that if we use fewer floats we consistently get 0 in our results:

import numpy as np

A  = np.random.uniform(0,1,(4,2))
w  = np.ones(2)
Aw = A*w
Sym1 = Aw.dot(Aw.T)
Sym2 = (A*w).dot((A*w).T)
diff = Sym1 - Sym2
# diff is all 0's (ymmv)
Fiche answered 17/4, 2017 at 15:11 Comment(3)
This is possible. But I think your experiment could also be explained by my suggestion: numpy treating symmetric products differently. How can we know for sure?Aile
We can know for sure this is not the reason, because Python intermediate values never keep 80-bit precision: they're stored in memory, not in registers. The important thing here is that A*w is being computed twice, not whether it's computed as part of a larger expression (see the check below).Deci
@Patrick. The most reliable way would be to read the source code.Improvvisatore
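
To make that point concrete, here is a quick check (my own, not from the thread): the two evaluations of A*w are bitwise identical, so the discrepancy cannot come from how the intermediate is computed. What differs is that in Sym1 the second operand is a view of the first operand's buffer, whereas in Sym2 the two operands live in separate buffers, and only the former matches the matrix-times-its-transpose case that triggers syrk in the accepted answer.

import numpy as np

A  = np.random.uniform(0, 1, (10, 5))
w  = np.ones(5)
B1 = A * w                                 # first evaluation of A*w
B2 = A * w                                 # second, independent evaluation
print(np.array_equal(B1, B2))              # True: bitwise-identical values
print(B1.ctypes.data == B1.T.ctypes.data)  # True: B1 and its transpose share a buffer
print(B1.ctypes.data == B2.ctypes.data)    # False: two separate buffers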
