I have written programs in C++, Python and Java for matrix multiplication and tested their speed for multiplying two 2000 x 2000 matrices (see post). The standard ikj-implentation - which is in - took:
Now I have implemented the Strassen algorithm for matrix multiplication - which is in - in Python and C++ as it was on wikipedia. These are the times I've got:
Why is Strassen matrix multiplication so much slower than standard matrix multiplication?
Ideas:
- Some cache effects
- Implementation:
- error (the resulting 2000 x 2000 matrix is correct)
- null-multiplication (shouldn't be that important for 2000 x 2000 -> 2048 x 2048)
This is especially astonishing, as it seems to contradict the experiences of others:
- Why is my Strassen Matrix multiplier so fast?
- Matrix multiplication: Strassen vs. Standard - Strassen was also slower for him, but it was at least in the same order of magnitude.
edit: The reason why Strassen matrix multiplication was slower in my case, were:
- I made it fully recursive (see tam)
- I had two functions
strassen
andstrassenRecursive
. The first one resized the matrix to a power of two, if required and called the second one. ButstrassenRecursive
didn't recursively call itself, butstrassen
.