How can I vectorize approximate matrix multiplication using sum of outer products?
Assume I have two matrices, A and B, and I want to compute C = AB as a sum of outer products. I have written the function below to achieve that, but I am wondering if I can eliminate the for loop and vectorize it:

import numpy as np

def mul_opx(A, B, pd):
    # Approximate matrix multiplication as a sum of outer products,
    # where term t is weighted by 1/pd[t] and the sum is averaged over the m terms
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p), dtype=A.dtype)
    for t in range(m):
        C = C + np.outer(A[:, t], B[t, :]) / pd[t]
    return C / m

d = 1000
A = np.arange(d**2).reshape((d,d))
B = np.arange(d**2).reshape((d,d))

# Full Matrix Multiplication
C = A @ B

# Approximate Matrix Multiplication
# sample half of the columns of A and the matching rows of B at random
k = np.random.choice(d, d // 2)
Ap = A[:, k]
Bp = B[k, :]

# Uniform probability vector
pd_uniform = np.full(d, 1/d)

# Approximate product
C_hat = mul_opx(Ap, Bp, pd_uniform[k])

This type of product is useful when the matrix dimensions are very large, say 10^6 x 10^6.

Spaak answered 2/3, 2022 at 11:6 Comment(16)
A @ B works fine.Edlin
The trick to having fast Python code is that you don't write it. You use libraries instead, which are hopefully not written in Python. So consider trying A @ B as suggested already, and if that fails you, the next thing to check is some Swiss-army knife like einsum.Dyer
Yes but I am actually trying to compute an approximation and will not use all of the outer products in calculating my result.Spaak
@AtifAli - Please include a minimal reproducible example of your actual problem in your question. Sampling rows and columns to approximate can be done with numpy but sounds like a really bad approach.Edlin
Sorry for this question, but what is the name of this @ operation?Frenchman
You are using matmul/@ in the loop. Just from reading code it isn't obvious how this is any different from using it on the whole arrays, A@B. dot/matmul/einsum are the 'vectorized' sum-of-products functions.Cygnus
@RabeeQasem, @ is the operator for np.matmul. That's an extension of np.dot.Cygnus
For this small example, your function is 10x slower than A@B. So A@B is the "vectorization". For your question to be meaningful, the function needs to return something different, something that is worth computing in a slower manner.Cygnus
Let's say I have matrices on the order of 10^6 x 10^6; with approximate computation I can reduce the number of operations to less than half. How to choose which outer products to add is another topic. I expect my method to be faster then, since I only have to do half the multiplications of a full matrix product at that scale.Spaak
To compute your "approximation" (it is not an approximation) vectorized: Ap @ Bp.Edlin
I agree Ap @ Bp seems to be the best way hereMexicali
To vectorize your new requirements: Ap/(len(k)*pd_uniform[k]) @ Bp. Still not an approximation.Edlin
@MichaelSzczesny This is the best answer and it speeds up the code by 20 times. I wish I could accept this as an answer.Spaak
You probably want to pass replace=False in your random.choice call.Archerfish
No, this is intentional; it can be done both ways. With replace=False the results will be better, but it adds extra computational load. @LearningisamessSpaak
Okay, I was not sure of your intention there; it seems like you want a 'bootstrap' kind of resampling. Noted.Archerfish

As others have mentioned, this could be a good use case for einsum. Your operation can be written in that language as

np.einsum('it,tj->ij', A, B)

The repeated t index is summed over, and the unrepeated i and j indices form each outer product. A quick benchmark seems to show a 2x speedup compared to @Tomer's proposed answer. This will depend on the input size of course, and I leave it to you to see how it generalizes to linear sizes in the 10^6 range; the memory footprint should also be better with the einsum.

[Benchmark plot: runtime of the einsum approach vs. the other proposed answer]
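
A quick sanity check, as a sketch (it assumes the mul_opx function from the question is in scope, and folds in the 1/pd weighting and 1/m averaging that mul_opx applies):

rng = np.random.default_rng(0)
A = rng.random((50, 40))
B = rng.random((40, 30))
pd = np.full(40, 1/40)  # uniform probabilities, as in the question

C_loop = mul_opx(A, B, pd)                       # loop over weighted outer products
C_ein = np.einsum('it,tj->ij', A / pd, B) / 40   # vectorized equivalent
print(np.allclose(C_loop, C_ein))                # True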

Archerfish answered 10/3, 2022 at 12:8 Comment(1)
I have updated the question with the exact code. And I need to divide each of the outer products by a number. How would I do that?Spaak

You can try using np.multiply.outer(A,B).

Assume you have the following data:

a = np.array([[4, 2],
              [2, 2]])

b = np.array([[4, 3],
              [4, 0]])

You want to do the following two multiplications:

np.outer(a[:,0],b[0,:]) = array([[16, 12],
                                 [ 8,  6]])
np.outer(a[:,1],b[1,:]) = array([[8, 0],
                                 [8, 0]])

This can be done using the np.multiply.outer method, which applies "the ufunc op to all pairs (a, b) with a in A and b in B" (see the description here). This function performs all possible outer products between A and B, which in this simple example results in an array of shape (2,2,2,2). Obviously, you do not need all the possible outer products; you just need to extract the ones you want from this array. The slice [:, t, t, :] corresponds to np.outer(a[:,t], b[t,:]). You can see that:

np.multiply.outer(a,b)[:,0,0,:] = array([[16, 12],
                                         [ 8,  6]])
np.multiply.outer(a,b)[:,1,1,:] = array([[8, 0],
                                         [8, 0]])
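
To finish the computation from here without writing the loop yourself, one option (a sketch) is to take the t-diagonal of that 4-d array and sum over it, for example with einsum:

M = np.multiply.outer(a, b)   # shape (2, 2, 2, 2); M[i, j, k, l] = a[i, j] * b[k, l]
C = np.einsum('ittj->ij', M)  # diagonal over the repeated t, summed away: equals a @ b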

Using this method you do not need to write the for loop, but you perform redundant computations. The numpy package is optimized, however, so perhaps this will still be faster; for very large A and B you can also try speeding things up using a jit decorator.
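
For instance, a minimal sketch using Numba's @njit (this assumes Numba is installed; mul_opx_jit is a hypothetical name, and it compiles the same loop as the question's mul_opx):

import numpy as np
from numba import njit

@njit
def mul_opx_jit(A, B, pd):
    # Same weighted sum of outer products as the question's mul_opx,
    # written as explicit loops that Numba compiles to machine code.
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p), dtype=np.float64)
    for t in range(m):
        for i in range(n):
            for j in range(p):
                C[i, j] += A[i, t] * B[t, j] / pd[t]
    return C / m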

Another method, in which you do not perform redundant computations, is to use np.newaxis to expand the matrices before multiplication.

Using the same a and b from above, perform the following (note that we expand a.T so that the first axis runs over the outer-product index t; for this particular symmetric a, a.T happens to equal a):

a.T[:,:,np.newaxis] = array([[[4],
                              [2]],

                             [[2],
                              [2]]])
b[:,np.newaxis,:] = array([[[4, 3]],

                           [[4, 0]]])

Now you can simply multiply elementwise to get:

a.T[:,:,np.newaxis]*b[:,np.newaxis,:] = array([[[16, 12],
                                                [ 8,  6]],

                                               [[ 8,  0],
                                                [ 8,  0]]])

This is exactly the stack of outer products, without the redundant computations of np.multiply.outer. All that is left is to sum over the first axis as follows:

results = np.sum(a.T[:,:,np.newaxis]*b[:,np.newaxis,:], axis=0)
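
As a quick check (assuming the a and b above), this reproduces the exact product a @ b:

print(results)                       # [[24 12]
                                     #  [16  6]]
print(np.allclose(results, a @ b))   # True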

Continuing with the second example, extending it to divide each outer product by a different number can be done as follows.

Assuming the vector pd consists of two numbers (since there are two outer products), the change can now simply be made using:

pd = np.array([[[6]],[[8]]])  # shape is (2,1,1) because there are two outer products
solution = np.sum((a.T[:,:,np.newaxis] * b[:,np.newaxis,:]) / pd, axis=0)

Another method is to set pd as a (1,2)-shaped array, and divide a prior to multiplication:

pd = np.array([[6,8]])  # shape (1,2) because there are two outer products
solution = np.sum((a / pd).T[:,:,np.newaxis] * b[:,np.newaxis,:], axis=0)

Regarding the Einstein summation proposed in the other answer, you can go with:

pd = np.array([[6,8]])  # shape (1,2) because there are two outer products
solution = np.einsum('it,tj->ij', a/pd, b)
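
As a quick check (assuming the a, b, and pd values above), all of these variants agree with the weighted sum computed by hand:

expected = np.outer(a[:,0], b[0,:]) / 6 + np.outer(a[:,1], b[1,:]) / 8
print(np.allclose(solution, expected))  # True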
Persistent answered 9/3, 2022 at 16:20 Comment(2)
I have updated the question with the exact code. And I need to divide each of the outer products by a number. How would I do that? –Spaak
I added the change to the answer :)Persistent