Subtract each row of matrix A from every row of matrix B without loops

Given two arrays, A (shape: M X C) and B (shape: N X C), is there a way to subtract each row of A from each row of B without using loops? The final output would be of shape (M*N X C).


Example

A = np.array([[  1,   2,   3], 
              [100, 200, 300]])

B = np.array([[  10,   20,   30],
              [1000, 2000, 3000],
              [ -10,  -20,   -2]])

Desired result (can have some other shape) (edited):

array([[  -9,   -18,   -27],
       [-999, -1998, -2997],
       [  11,    22,     5],
       [  90,   180,   270],
       [-900, -1800, -2700],
       [ 110,   220,   302]])

Shape: 6 X 3

(Loop is too slow, and "outer" subtracts each element instead of each row)

Murton answered 20/1, 2018 at 17:27 Comment(2)
I edited the question to show the desired result. - Murton
The given desired output does not match the description of the desired output. - Evermore
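
The mismatch noted in the last comment appears to come down to operand order: the wording asks for B minus A, while the posted result is A minus B. A minimal sketch of both, using the arrays from the question (the variable names `a_minus_b` and `b_minus_a` are only illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [100, 200, 300]])
B = np.array([[10, 20, 30],
              [1000, 2000, 3000],
              [-10, -20, -2]])

C = A.shape[1]

# A[i] - B[j] for every pair of rows: matches the "desired result" in the question
a_minus_b = (A[:, np.newaxis] - B).reshape(-1, C)

# B[j] - A[i] for every pair of rows, in the same row order:
# matches the wording "subtract each row of A from each row of B"
b_minus_a = (B - A[:, np.newaxis]).reshape(-1, C)
```

Both arrays have shape (6, 3) and differ only in sign, so either reading can be served by the same broadcasting pattern.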

It's possible to do this efficiently (without any loops) by leveraging broadcasting:

In [28]: (A[:, np.newaxis] - B).reshape(-1, A.shape[1])
Out[28]: 
array([[   -9,   -18,   -27],
       [ -999, -1998, -2997],
       [   11,    22,     5],
       [   90,   180,   270],
       [ -900, -1800, -2700],
       [  110,   220,   302]])
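
To see why this works, it may help to look at the intermediate shapes. The sketch below splits the one-liner into steps and checks it against an explicit loop; the names `A_3d`, `diff`, `result` and `expected` are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [100, 200, 300]])
B = np.array([[10, 20, 30],
              [1000, 2000, 3000],
              [-10, -20, -2]])

M, C = A.shape
N = B.shape[0]

A_3d = A[:, np.newaxis]        # shape (M, 1, C)
diff = A_3d - B                # broadcasts to (M, N, C); diff[i, j] == A[i] - B[j]
result = diff.reshape(-1, C)   # shape (M * N, C); row i * N + j holds A[i] - B[j]

# Reference computed with explicit loops, for comparison only
expected = np.array([A[i] - B[j] for i in range(M) for j in range(N)])
assert np.array_equal(result, expected)
```

The length-1 axis inserted by `np.newaxis` is what lets NumPy pair every row of A with every row of B.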

Or, for a slightly faster solution than plain NumPy broadcasting, numexpr can evaluate the same expression:

In [31]: A_3D = A[:, np.newaxis]
In [32]: import numexpr as ne

# pass the expression for subtraction as a string to `evaluate` function
In [33]: ne.evaluate('A_3D - B').reshape(-1, A.shape[1])
Out[33]: 
array([[   -9,   -18,   -27],
       [ -999, -1998, -2997],
       [   11,    22,     5],
       [   90,   180,   270],
       [ -900, -1800, -2700],
       [  110,   220,   302]], dtype=int64)

Another, less efficient approach is to use np.repeat and np.tile to match the shapes of both arrays. Note that this is the least efficient of the three options because it makes copies when matching the shapes.

In [27]: np.repeat(A, B.shape[0], 0) - np.tile(B, (A.shape[0], 1))
Out[27]: 
array([[   -9,   -18,   -27],
       [ -999, -1998, -2997],
       [   11,    22,     5],
       [   90,   180,   270],
       [ -900, -1800, -2700],
       [  110,   220,   302]])
Punic answered 20/1, 2018 at 17:34 Comment(5)
I'd reverse the order of your suggestions -- broadcasting here is the canonical way. Manual repeats and tiles should be reserved for when they're needed. - Gentleman
Thank you. I definitely prefer the broadcasting method. - Murton
Thank you. I was not familiar with numexpr before. Will try that too. - Murton
@Punic this creates a problem for large matrices. I have matrix A (180000,3) and matrix B (50000,3), all doubles. When I tried to calculate A[:, np.newaxis] - B I got a MemoryError. Any recommendation to help solve this problem, please? Looping takes 1100 seconds. - Aparri
@MubeenShahid Yes, there might be a problem with such huge array sizes. With the sizes you mentioned, the result alone would need about 201 GiB of RAM for doubles (i.e. (180000 * 50000 * 3 * 8)/1024/1024/1024; about 101 GiB even with 4-byte elements), which is more than most machines have. To work around this issue, numpy.memmap might help, but I haven't tried that myself. You can also post a new question to gather more opinions from others... - Punic
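
One possible way around the MemoryError, sketched here under the assumption that each block of differences can be reduced or written out before the next one is computed, is to process A a few rows at a time. The helpers `estimate_result_bytes` and `iter_row_diffs` are hypothetical names introduced for this illustration, not NumPy functions.

```python
import numpy as np

def estimate_result_bytes(m, n, c, dtype=np.float64):
    """Bytes needed to hold the full (m * n, c) difference array."""
    return m * n * c * np.dtype(dtype).itemsize

def iter_row_diffs(A, B, chunk_rows):
    """Yield (start, block) pairs. Each block holds A[i] - B[j] for up to
    chunk_rows rows of A, shaped (rows_in_chunk * len(B), C)."""
    C = A.shape[1]
    for start in range(0, A.shape[0], chunk_rows):
        block = A[start:start + chunk_rows, np.newaxis] - B
        yield start, block.reshape(-1, C)

# Size of the full result for the shapes mentioned in the comments (float64)
full_bytes = estimate_result_bytes(180_000, 50_000, 3)
full_gib = full_bytes / 1024**3

# Small demonstration that the chunks reproduce the one-shot result
rng = np.random.default_rng(0)
A_demo = rng.random((7, 3))
B_demo = rng.random((5, 3))

chunks = [block for _, block in iter_row_diffs(A_demo, B_demo, chunk_rows=3)]
stacked = np.concatenate(chunks)
```

Chunking only bounds the peak memory of each step. If every difference must be kept, the total still has to live somewhere, for example on disk.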

Using the Kronecker product (numpy.kron):

>>> import numpy as np
>>> A = np.array([[  1,   2,   3], 
...               [100, 200, 300]])
>>> B = np.array([[  10,   20,   30],
...               [1000, 2000, 3000],
...               [ -10,  -20,   -2]])
>>> (m,c) = A.shape
>>> (n,c) = B.shape
>>> np.kron(A,np.ones((n,1))) - np.kron(np.ones((m,1)),B)
array([[   -9.,   -18.,   -27.],
       [ -999., -1998., -2997.],
       [   11.,    22.,     5.],
       [   90.,   180.,   270.],
       [ -900., -1800., -2700.],
       [  110.,   220.,   302.]])
Evermore answered 21/1, 2018 at 22:12 Comment(0)
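
The Kronecker-product output above is floating point because np.ones defaults to float64. A small variation, assuming an integer result is wanted, passes A's dtype to np.ones and compares against the broadcasting form (the names `via_kron` and `via_broadcast` are illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [100, 200, 300]])
B = np.array([[10, 20, 30],
              [1000, 2000, 3000],
              [-10, -20, -2]])

m, c = A.shape
n = B.shape[0]

# Repeat each row of A n times, and tile B m times, both via np.kron.
# Integer ones keep the integer dtype of the inputs.
ones_n = np.ones((n, 1), dtype=A.dtype)
ones_m = np.ones((m, 1), dtype=B.dtype)
via_kron = np.kron(A, ones_n) - np.kron(ones_m, B)

# Same result from broadcasting, for comparison
via_broadcast = (A[:, np.newaxis] - B).reshape(-1, c)
```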
