python list + empty numpy array = empty numpy array?

Today I noticed something odd in my code and discovered that, in certain situations, it boils down to executing the following:

my_list = [0] + np.array([])

which results in my_list being the following:

array([], dtype=float64)

At first I was quite confused, then I understood that the interpreter first converts the list to a NumPy array and then attempts a broadcasting operation:

>>> np.array([0]) + np.array([])
array([], dtype=float64)

I have some questions about this behaviour:

Why is it broadcasting?
Wouldn't it be better if Python threw an error, at least for this particular case where a list is converted and made to disappear?

Thank you for your clarifications!

First of all:

Wouldn't it be better if Python threw an error, at least for this particular case where a list is converted and made to disappear?

I don't think that test is possible. According to this comment:

For reversed operations like b.__radd__(a) we call the corresponding ufunc.

This means that using [0] + np.array([]) will actually call the ufunc np.add([0], np.array([])), which converts array-like lists to arrays without having a chance to decide about the size of the operands.
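A quick way to check this equivalence is to compare the operator form with the explicit ufunc call. This is only a sketch; the variable names are made up for illustration:

```python
import numpy as np

empty = np.array([])

# Operator form: handled by the array's reflected addition.
via_operator = [0] + empty

# Explicit ufunc call, which the comment quoted above says is what runs.
via_ufunc = np.add([0], empty)

# Both results are ndarrays with the same shape and dtype.
print(type(via_operator).__name__, via_operator.shape, via_operator.dtype)
print(type(via_ufunc).__name__, via_ufunc.shape, via_ufunc.dtype)
```

In both cases the list has already been turned into an array by the time the addition is carried out.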

So broadcasting is a given. The question is then whether it's sane to have shapes (1,) and (0,) broadcast to (0,). You can think about it this way: scalars always broadcast, and 1-element 1d arrays are as good as scalars in most situations:

>>> np.add([0], [])
array([], dtype=float64)

>>> np.add(0, [])
array([], dtype=float64)

If you look at it this way, it makes sense as a rule, even though I agree it's surprising, especially since arrays of any length other than one won't broadcast like this. But it's not a bug (just an interesting situation for a feature).
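To see the contrast with longer arrays, here is a small sketch (variable names are illustrative) that tries the same addition with a one-element and a two-element list:

```python
import numpy as np

empty = np.array([])

# A length-1 operand is stretched to match the empty array.
ok_shape = np.add([0], empty).shape

# A length-2 operand cannot be stretched, so NumPy raises ValueError.
try:
    np.add([0, 1], empty)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(ok_shape)
print(type(mismatch_error).__name__)
```

So an error is raised as soon as the non-empty operand has a length other than one; only the singleton case passes silently.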

To be more precise, what is happening with broadcasting is always that "dimensions with size 1 will broadcast". The array-like [0] has shape (1,), and the np.array([]) has shape (0,) (as opposed to a scalar np.int64() which would have shape ()!). So broadcasting happens on the singleton, and the result has shape (0,).
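The shapes involved can be inspected directly, without doing any arithmetic. The sketch below assumes NumPy 1.20 or newer, where np.broadcast_shapes is available; the variable names are only for illustration:

```python
import numpy as np

# Shape of a NumPy scalar versus one-element and empty 1d arrays.
scalar_shape = np.shape(np.int64())    # zero-dimensional
single_shape = np.shape([0])           # one dimension of size 1
empty_shape = np.shape(np.array([]))   # one dimension of size 0

# Broadcast the shapes themselves; no result arrays are allocated.
from_single = np.broadcast_shapes(single_shape, empty_shape)
from_scalar = np.broadcast_shapes(scalar_shape, empty_shape)

print(scalar_shape, single_shape, empty_shape)
print(from_single, from_scalar)
```

Both the scalar and the singleton end up with the shape of the empty array, which is why the result has no elements.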

It gets clearer if we inject more singleton dimensions:

>>> ([0] + np.array([])).shape
(0,)

>>> ([[0]] + np.array([])).shape
(1, 0)

>>> ([[[0]]] + np.array([])).shape
(1, 1, 0)

>>> np.shape([[[0]]])
(1, 1, 1)

So for instance in the last case shapes (1, 1, 1) will nicely broadcast with a 1d array along its last dimension, and the result should indeed be (1, 1, 0) in this case.
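Coming back to the wish for an error: since NumPy treats this as regular broadcasting, such a check would have to live in your own code. A minimal sketch, assuming a hypothetical helper named strict_add (not part of NumPy) that refuses to combine an empty operand with a non-empty one:

```python
import numpy as np

def strict_add(a, b):
    """Add a and b, rejecting the case where exactly one operand is empty.

    Hypothetical helper for illustration; not a NumPy function.
    """
    a_arr, b_arr = np.asarray(a), np.asarray(b)
    # Exactly one empty operand means the other would silently vanish.
    if (a_arr.size == 0) != (b_arr.size == 0):
        raise ValueError(
            f"refusing to broadcast shapes {a_arr.shape} and {b_arr.shape}: "
            "one operand is empty"
        )
    return a_arr + b_arr

print(strict_add([0], [1]))        # ordinary addition still works
print(strict_add([], []).shape)    # two empty operands are allowed

try:
    strict_add([0], [])
except ValueError as exc:
    print("ValueError:", exc)
```

Whether two empty operands should be allowed is a design choice; this sketch permits it because nothing is lost in that case.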
