What is going on behind this numpy selection behavior?

While answering another question, several others and I wrongly assumed that the following would work:

Say one has

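import numpy as np

test = [ [ [0], 1 ],
         [ [1], 1 ]
       ]
# dtype=object must be given explicitly on recent NumPy versions,
# which otherwise raise a ValueError for ragged nested lists
nptest = np.array(test, dtype=object)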

What is the reason behind

>>> nptest[:,0]==[1]
array([False, False], dtype=bool)

while one has

>>> nptest[0,0]==[1],nptest[1,0]==[1]
(False, True)


or
>>> nptest==[1]
array([[False,  True],
       [False,  True]], dtype=bool)

or

>>> nptest==1
array([[False,  True],
       [False,  True]], dtype=bool)

Is it the degeneracy in terms of dimensions that causes this?

Moray answered 31/7, 2017 at 23:7 Comment(2)
This is not a kind of interaction anyone cared to make easy when designing NumPy. NumPy is designed for rigid multidimensional grids of numbers. Trying to get anything but a rigid multidimensional grid is going to be painful. (Pratt)
Moral of the story: Don't use dtype=object arrays. They are stunted Python lists, with worse performance characteristics, and numpy is not designed to handle the case of sequence-like containers within these object arrays. (Mandler)

nptest is a 2D array of object dtype, and the first element of each row is a list.

nptest[:, 0] is a 1D array of object dtype, each of whose elements is a list.
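As a quick check of the two statements above, the shapes and dtypes can be inspected directly. This is only a sketch: it rebuilds the array from the question and passes dtype=object explicitly, which is an assumption added here because recent NumPy versions reject ragged nested lists without it. The name `column` is introduced purely for illustration.

```python
import numpy as np

# Same data as in the question; dtype=object is passed explicitly
# (assumption: needed on recent NumPy versions for ragged input).
nptest = np.array([[[0], 1], [[1], 1]], dtype=object)

print(nptest.shape, nptest.dtype)    # (2, 2) object

# First column: a 1D object array whose elements are Python lists.
column = nptest[:, 0]
print(column.shape, column.dtype)    # (2,) object
print([type(item).__name__ for item in column])  # ['list', 'list']
```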

When you do nptest[:,0]==[1], NumPy does not perform an elementwise comparison of each element of nptest[:,0] against the list [1]. It creates as high-dimensional an array as it can from [1], producing the 1D array np.array([1]), and then broadcasts the comparison, comparing each element of nptest[:,0] against the integer 1.

Since no list in nptest[:, 0] is equal to 1, all elements of the result are False.
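The broadcasting described above can be reproduced and contrasted with a comparison against the list itself. This is a minimal sketch, not the only way to do it: the list comprehension is simply one workaround that falls back to ordinary Python list equality, and the names `column`, `broadcast_result`, and `listwise_result` are illustrative.

```python
import numpy as np

# dtype=object is an assumption added so this also runs on recent NumPy.
nptest = np.array([[[0], 1], [[1], 1]], dtype=object)
column = nptest[:, 0]   # 1D object array holding the lists [0] and [1]

# The right-hand list becomes np.array([1]) and is broadcast, so each
# stored list is compared with the integer 1, not with the list [1].
broadcast_result = column == [1]

# Comparing against the list itself, one element at a time in Python.
listwise_result = np.array([item == [1] for item in column])

print(broadcast_result)  # [False False]
print(listwise_result)   # [False  True]
```

The second result matches the scalar-indexed comparisons nptest[0,0]==[1] and nptest[1,0]==[1] from the question.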

Pratt answered 31/7, 2017 at 23:14 Comment(0)
