Numpy array broadcasting rules

Asked 24/6, 2012 at 14:16 Answered 24/9, 2021 at 7:15

I'm having some trouble understanding the rules for array broadcasting in Numpy.

Obviously, if you perform element-wise multiplication on two arrays of the same dimensions and shape, everything is fine. Also, if you multiply a multi-dimensional array by a scalar it works. This I understand.

But if you have two N-dimensional arrays of different shapes, it's unclear to me exactly what the broadcasting rules are. This documentation/tutorial explains that:

In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.

Okay, so I assume by trailing axis they are referring to the N in an M x N array. So, that means if I attempt to multiply two 2D arrays (matrices) with an equal number of columns, it should work? Except it doesn't...

>>> from numpy import *
>>> A = array([[1,2],[3,4]])
>>> B = array([[2,3],[4,6],[6,9],[8,12]])
>>> print(A)
[[1 2]
 [3 4]]
>>> print(B)
[[ 2  3]
 [ 4  6]
 [ 6  9]
 [ 8 12]]
>>> 
>>> A * B
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Since both A and B have two columns, I would have thought this would work. So, I'm probably misunderstanding something here about the term "trailing axis", and how it applies to N-dimensional arrays.

Can someone explain why my example doesn't work, and what is meant by "trailing axis"?

Cornered answered 24/6, 2012 at 14:16 Comment(1)

why is it named broadcasting? It is so confusing just like the name "imaginary numbers" – Ewaewald 7/6, 2017 at 7:33

Well, the meaning of trailing axes is explained on the linked documentation page. If you have two arrays with a different number of dimensions, say one 1x2x3 and the other 2x3, then you compare only the trailing common dimensions, in this case 2x3. But if both your arrays are two-dimensional, then in each dimension the corresponding sizes have to be either equal, or one of them has to be 1. Dimensions along which an array has size 1 are called singular, and the array can be broadcast along them.

In your case you have a 2x2 and a 4x2 array. The sizes of the first dimension are 2 and 4; 4 != 2 and neither 4 nor 2 equals 1, so this doesn't work.
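To make this concrete, here is a minimal sketch that reuses A and B from the question. The names `compatible`, `row` and `product` are only illustrative, and `np.broadcast` is used here just as a convenient way to test whether two arrays can be broadcast without computing anything.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])                    # shape (2, 2)
B = np.array([[2, 3], [4, 6], [6, 9], [8, 12]])   # shape (4, 2)

# The last axes match (2 and 2), but the first axes are 2 and 4:
# they are not equal and neither is 1, so broadcasting fails.
try:
    np.broadcast(A, B)
    compatible = True
except ValueError:
    compatible = False

# A single row of A has shape (1, 2); its size-1 axis can be
# stretched to length 4 to line up with B.
row = A[:1]          # shape (1, 2)
product = row * B    # shape (4, 2)
```

Here `compatible` ends up False, while `row * B` works because the only mismatching axis of `row` has size 1.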

Strung answered 24/6, 2012 at 14:25 Comment(3)

In other words, the shape of A should be a suffix of the shape of B, disregarding any axis that has value 1 (?) – Pucker 24/6, 2012 at 14:26

if by disregarding you mean '1 equals anything' and either shape(A) or shape(B) can be suffixes of one another, then yes. – Strung 24/6, 2012 at 14:28

Actually, you can look at any array as being infinite-dimensional, with shape ...x1x1x1x1xAxBxC, so there are a lot of leading 1s, which can be broadcast like any other 1s. This way you can forget the suffix stuff and just say 1 equals anything. – Strung 24/6, 2012 at 14:30

From Stanford CS231n's Numpy Tutorial:

Broadcasting two arrays together follows these rules:

1. If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.

2. The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.

3. The arrays can be broadcast together if they are compatible in all dimensions.

4. After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.

5. In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.

If this explanation does not make sense, try reading the explanation from the documentation or this explanation.
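The rules quoted above can be sketched as a few lines of plain Python that work on shape tuples only. This is an illustration of the rules as listed, not NumPy's actual implementation, and the function name `broadcast_shape` is made up for this example.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError."""
    ndim = max(len(shape_a), len(shape_b))
    # First rule: left-pad the shorter shape with 1s.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(a, b):
        # Second and third rules: in every dimension the sizes must be
        # equal, or one of them must be 1.
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError(f"incompatible sizes {size_a} and {size_b}")
        # Fourth rule: a size-1 axis takes the size of the other array.
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)


ok = broadcast_shape((2, 3), (1, 2, 3))     # (1, 2, 3)
stretched = broadcast_shape((4, 1), (3,))   # (4, 3)

try:
    broadcast_shape((2, 2), (4, 2))         # the shapes from the question
    question_shapes_ok = True
except ValueError:
    question_shapes_ok = False
```

With the shapes from the question, (2, 2) and (4, 2), the check fails in the first dimension, which matches the error NumPy raises.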

Skurnik answered 10/1, 2019 at 21:9 Comment(1)

I still can't get the following. Let's say: x = np.random.randn(3,) and W = np.random.randn(4, 3, 3). Why W @ x != W @ np.broadcast_to(x, W.shape)? According to rule 4 they shouldn't be different. – Richart 25/6 at 11:3
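One possible reading of the comment above: the quoted rules describe element-wise operations, whereas `@` (`np.matmul`) treats a 1-D operand as a vector and broadcasts only the leading "stack" axes. The sketch below uses fixed values instead of random ones so the shapes are easy to check; the variable names are illustrative.

```python
import numpy as np

W = np.arange(36.0).reshape(4, 3, 3)
x = np.array([1.0, 2.0, 3.0])

# A 1-D x is treated as a vector: each of the 4 matrices in W is
# multiplied by x, giving one length-3 vector per matrix.
as_vector = W @ x                        # shape (4, 3)

# broadcast_to builds a stack of 3x3 matrices whose rows are all x,
# so this is a stack of matrix-matrix products instead.
x_stack = np.broadcast_to(x, W.shape)    # shape (4, 3, 3)
as_matrices = W @ x_stack                # shape (4, 3, 3)

# For an element-wise operation the two forms do agree.
same_elementwise = np.array_equal(W * x, W * x_stack)
```

So the two `@` results differ even in shape, while `W * x` and `W * x_stack` are identical, which is the case the broadcasting rules actually describe.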

We should consider two points about broadcasting. First: what is possible. Second: how much of what is possible NumPy actually does.

I know it might look a bit confusing, but I will make it clear with an example.

Let's start from the beginning.

Suppose we have two arrays. The first (named A) has three dimensions and the second (named B) has five. NumPy tries to match the last (trailing) dimensions, so it does not care about the first two dimensions of B. It then compares the remaining trailing dimensions pairwise. If and only if each pair is equal, or one of the two sizes is 1, NumPy says "O.K., you two match". If these conditions are not satisfied, NumPy says "sorry... it's not my job!".
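As a small illustration of the paragraph above, here is a sketch with one possible choice of shapes for A and B (the specific sizes are assumptions picked for the example):

```python
import numpy as np

A = np.ones((3, 1, 5))          # three dimensions
B = np.ones((2, 7, 3, 4, 5))    # five dimensions

# Compared from the right: 5 vs 5, 1 vs 4, 3 vs 3. The two leading
# axes of B (2 and 7) have no counterpart in A and are kept as is.
total = A + B
result_shape = total.shape      # (2, 7, 3, 4, 5)
```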

But you may say that the comparison would be better if it could also handle sizes where one divides the other (4 and 2, or 9 and 3), since the smaller array could be replicated a whole number of times (2 or 3 times in our example). I agree with you, and this is the reason I started my discussion with a distinction between what is possible and what NumPy is capable of.
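For the divisible case, the replication can still be done by hand. A minimal sketch, assuming you want the smaller array repeated as a whole block (the arrays `a` and `b` are made up for the example):

```python
import numpy as np

a = np.array([10, 20])         # length 2
b = np.array([1, 2, 3, 4])     # length 4, a multiple of 2

# NumPy does not repeat a to fit b automatically.
try:
    a * b
    auto_broadcast = True
except ValueError:
    auto_broadcast = False

# Explicit replication: repeat a as a whole, b.size // a.size times.
tiled = np.tile(a, b.size // a.size)   # [10, 20, 10, 20]
product = tiled * b
```

Note that the replication is ambiguous: `np.tile` gives [10, 20, 10, 20], while `np.repeat(a, 2)` would give [10, 10, 20, 20]. That ambiguity may be one reason NumPy leaves this choice to the user.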

Strasbourg answered 24/9, 2021 at 7:15 Comment(0)
