Mathematically, I think numpy's dot makes more sense:
dot(a,b)_{i,j,k,a,b,c} = \sum_{m} a_{i,j,k,m} b_{a,b,m,c}
since it gives the dot product when a and b are vectors, and the matrix product when a and b are matrices.
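As a rough illustration, np.dot on N-D arrays contracts the last axis of a with the second-to-last axis of b and keeps every other axis of both inputs. The sketch below checks this against an explicit np.einsum contraction; the shapes and the seed are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, chosen for the example
a = rng.random((2, 3, 4, 5))        # last axis has length 5
b = rng.random((6, 7, 5, 8))        # second-to-last axis has length 5

d = np.dot(a, b)

# Explicit form of the sum over m; the output keeps all free axes of a and b.
d_ref = np.einsum('ijkm,abmc->ijkabc', a, b)

print(d.shape)                      # (2, 3, 4, 6, 7, 8)
print(np.allclose(d, d_ref))        # True
```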
As for numpy's matmul, its result consists of part of the dot result, and it can be defined as
matmul(a,b)_{i,j,k,c} = \sum_{m} a_{i,j,k,m} b_{i,j,m,c}
So, you can see that matmul(a,b) returns an array with a smaller shape,
which consumes less memory and makes more sense in applications.
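To make the size difference concrete, here is a small sketch with arbitrarily chosen shapes. np.matmul treats the leading axes as batch axes, so its result matches the entries of the np.dot result where the leading indices of a and b coincide.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, chosen for the example
a = rng.random((2, 3, 4, 5))
b = rng.random((2, 3, 5, 6))

c = np.matmul(a, b)                 # shape (2, 3, 4, 6)
d = np.dot(a, b)                    # shape (2, 3, 4, 2, 3, 6)

# matmul as an explicit batched contraction over m
c_ref = np.einsum('ijkm,ijmc->ijkc', a, b)

# The same values picked out of the dot result: d[i, j, k, i, j, c]
c_from_d = np.einsum('ijkijc->ijkc', d)

print(c.shape, d.shape)
print(np.allclose(c, c_ref), np.allclose(c, c_from_d))
print(d.size // c.size)             # dot stores 6 times as many entries here
```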
In particular, combined with broadcasting, you can get
matmul(a,b)_{i,j,k,l} = \sum_{m} a_{i,j,k,m} b_{j,m,l}
for example.
From the above two definitions, you can see the requirements for using the two operations. Assume a.shape=(s1,s2,s3,s4) and b.shape=(t1,t2,t3,t4).
- To use dot(a,b) you need
  - t3=s4
- To use matmul(a,b) you need
  - t3=s4
  - t2=s2, or one of t2 and s2 is 1
  - t1=s1, or one of t1 and s1 is 1
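The shape rules above can be sketched as two small predicates and compared with what numpy actually accepts. The helper names dot_compatible, matmul_compatible and runs are hypothetical, introduced only for this example, and np.broadcast_shapes assumes NumPy 1.20 or newer.

```python
import numpy as np

def dot_compatible(sa, sb):
    """Hypothetical helper: np.dot rule for shapes with ndim >= 2."""
    return sa[-1] == sb[-2]

def matmul_compatible(sa, sb):
    """Hypothetical helper: np.matmul rule for shapes with ndim >= 2."""
    if sa[-1] != sb[-2]:
        return False
    try:
        # The leading (batch) axes must broadcast against each other.
        np.broadcast_shapes(sa[:-2], sb[:-2])
    except ValueError:
        return False
    return True

def runs(op, sa, sb):
    """Return True if op accepts arrays of the given shapes."""
    try:
        op(np.zeros(sa), np.zeros(sb))
    except ValueError:
        return False
    return True

cases = [
    ((5, 6, 2, 4), (5, 6, 4, 3)),   # all leading dims equal
    ((5, 6, 2, 4), (1, 6, 4, 3)),   # t1 == 1, so it broadcasts
    ((5, 6, 2, 4), (7, 8, 4, 3)),   # leading dims differ: only dot works
    ((5, 6, 2, 4), (5, 6, 3, 4)),   # t3 != s4: neither works
]

results = []
for sa, sb in cases:
    ok_dot, ok_matmul = dot_compatible(sa, sb), matmul_compatible(sa, sb)
    # The predicates should agree with numpy's own behaviour.
    assert ok_dot == runs(np.dot, sa, sb)
    assert ok_matmul == runs(np.matmul, sa, sb)
    results.append((ok_dot, ok_matmul))
    print(sa, sb, 'dot:', ok_dot, 'matmul:', ok_matmul)
```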
Use the following piece of code to convince yourself.
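import numpy as np

for it in range(10000):
    a = np.random.rand(5, 6, 2, 4)
    b = np.random.rand(6, 4, 3)
    c = np.matmul(a, b)  # shape (5, 6, 2, 3)
    d = np.dot(a, b)     # shape (5, 6, 2, 6, 3)
    # print('c shape:', c.shape, 'd shape:', d.shape)
    for i in range(5):
        for j in range(6):
            for k in range(2):
                for l in range(3):
                    # matmul keeps the dot entries where both j indices agree;
                    # compare with a tolerance to allow for floating-point rounding
                    if abs(c[i, j, k, l] - d[i, j, k, j, l]) > 1e-12:
                        print(it, i, j, k, l, c[i, j, k, l], d[i, j, k, j, l])  # you will not see them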
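The same check can also be written without explicit loops. This is only a sketch of an alternative, using np.einsum to pick out the entries d[i,j,k,j,l] and np.allclose to compare up to floating-point rounding; the seed is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, chosen for the example
a = rng.random((5, 6, 2, 4))
b = rng.random((6, 4, 3))

c = np.matmul(a, b)                 # shape (5, 6, 2, 3), b is broadcast over i
d = np.dot(a, b)                    # shape (5, 6, 2, 6, 3)

# Entries of the dot result where the two j indices agree: d[i, j, k, j, l]
d_diag = np.einsum('ijkjl->ijkl', d)

# The broadcast matmul written as an explicit sum over m
c_ref = np.einsum('ijkm,jml->ijkl', a, b)

print(np.allclose(c, d_diag), np.allclose(c, c_ref))
```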
matmul function years ago? @ as an infix operator is new, but the function works just as well without it. – Fresno