How to perform element-wise Boolean operations on NumPy arrays [duplicate]
P

4

89

For example, I would like to create a mask that masks out elements with values between 40 and 60:

import numpy as np
foo = np.asanyarray(range(100))
mask = (foo < 40).__or__(foo > 60)

Which just looks ugly. I can't write

(foo < 40) or (foo > 60)

because I end up with:

  ValueError Traceback (most recent call last)
  ...
  ----> 1 (foo < 40) or (foo > 60)
  ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Is there a canonical way of doing element-wise Boolean operations on NumPy arrays with good looking code?

Prostatectomy answered 25/12, 2011 at 23:3 Comment(1)
I'm not really convinced this is a duplicate. The other question is primarily about pandas.Villain
B
120

Try this:

mask = (foo < 40) | (foo > 60)

Note: the __or__ method of an object overloads the bitwise OR operator (|), not the Boolean or operator, which Python does not allow to be overloaded.
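The parentheses matter here because | binds more tightly than the comparison operators. The sketch below is only an illustration (the names foo_float, error_int and error_float are made up for it): it shows the parenthesised form next to what happens when the parentheses are dropped, for an integer array and for a float array.

```python
import numpy as np

foo = np.arange(100)

# Parenthesised: both comparisons run first, then | combines them element-wise.
mask = (foo < 40) | (foo > 60)
print(mask.sum())  # 79 elements lie outside the range 40..60

# Without parentheses Python parses this as: foo < (40 | foo) > 60,
# a chained comparison that ends up calling bool() on an array.
try:
    foo < 40 | foo > 60
    error_int = None
except ValueError as exc:
    error_int = exc

# With a float array, 40 | foo_float already fails, because bitwise OR
# is not defined for floating-point values.
foo_float = foo.astype(float)
try:
    foo_float < 40 | foo_float > 60
    error_float = None
except TypeError as exc:
    error_float = exc

print(type(error_int).__name__, type(error_float).__name__)
```

The second failure is probably the TypeError about bitwise_or that sometimes gets reported as "it doesn't work".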

Boyes answered 25/12, 2011 at 23:6 Comment(8)
it doesn't work: TypeError: ufunc 'bitwise_or' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''Prissie
Don't forget to properly bracket your expressionsThan
Hmm...why are the parentheses required?Neuter
@Neuter The reason is that bitwise or has higher precedence than the comparison operators (that doesn't happen with the boolean or). For more information, please have a look at the documentationBoyes
@Boyes Thanks. It appears | is overloaded by NumPy to work with boolean NumPy arrays (since it doesn't act as a strict "bitwise" operator anymore). Strangely, I was not able to find this documented in NumPy's documentation: docs.scipy.org/doc/numpy-1.13.0/reference/generated/… Or is it? What I'm trying to get at is...is it "officially" supported syntax?Neuter
@Neuter I think the documentation that you were looking for is here: <<This ufunc implements the C/Python operator |>>.Boyes
@Boyes Great! Looks like | not only works element-wise, it will operate bitwise on those elements - which makes a difference for integer arrays, but not boolean arrays, in comparison to np.logical_or(.). I like the | syntax more, but the requirement of parentheses in Python (a source of hard-to-find bugs) still makes me a little envious of MATLAB users...Neuter
Please be very cautious with this syntax in larger data sets... try the other solution below with np.any / np.all to avoid hard-to-trace bugs in corner cases.Heracles
C
30

You can use the NumPy logical operations. In your example:

np.logical_or(foo < 40, foo > 60)
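For Boolean inputs np.logical_or and | give the same result. The difference only shows up with non-Boolean inputs, where the logical functions first treat every nonzero value as True. A small sketch with two made-up integer arrays a and b:

```python
import numpy as np

a = np.array([0, 1, 2, 0])
b = np.array([0, 2, 4, 1])

# Bitwise operators work on the bits of each pair of integers.
bitwise_or = a | b    # [0, 3, 6, 1]
bitwise_and = a & b   # [0, 0, 0, 0], since 1 & 2 == 0 and 2 & 4 == 0

# Logical functions treat every nonzero element as True.
logical_or = np.logical_or(a, b)    # [False, True, True, True]
logical_and = np.logical_and(a, b)  # [False, True, True, False]

print(bitwise_or, bitwise_and)
print(logical_or, logical_and)
```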
Consol answered 28/7, 2017 at 18:34 Comment(0)
G
26

If you are only combining Boolean arrays, as in your example, you can use the bitwise OR operator | as suggested by Jcollado. But beware: this can give you strange results if you ever use non-Booleans, such as mask = (foo < 40) | override. You are fine only as long as override is guaranteed to contain nothing but False, True, 1, or 0.
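One way such a "strange result" can show up, sketched under the assumption that override is an integer array that happens to contain a 2 (the values are invented for illustration): the combined array is no longer Boolean, so using it as an index selects by position instead of masking.

```python
import numpy as np

foo = np.arange(35, 46)

# Hypothetical override flags; the 2 is the problematic non-Boolean value.
override = np.zeros(foo.shape, dtype=int)
override[0] = 2

combined = (foo < 40) | override
print(combined.dtype.kind)  # 'i': an integer array, not a Boolean mask
print(combined[0])          # 3, because True | 2 == 1 | 2 == 3

# Indexing with an integer array picks elements by position (fancy indexing),
# so this returns 11 values instead of the masked subset.
wrong = foo[combined]

# np.logical_or converts nonzero values to True first and yields a real mask.
safe = np.logical_or(foo < 40, override)
print(foo[safe])  # [35 36 37 38 39]
```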

More general is the use of NumPy's comparison set operators, np.any and np.all. This snippet returns all values between 35 and 45 which are less than 40 or not a multiple of 3:

import numpy as np
foo = np.arange(35, 46)
mask = np.any([(foo < 40), (foo % 3)], axis=0)
print(foo[mask])
OUTPUT: [35 36 37 38 39 40 41 43 44]

It is not as nice as with |, but nicer than the code in your question.
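The same pattern extends to more than two conditions. As a rough sketch (the three conditions are chosen only for illustration), np.all with axis=0 combines a list of conditions with AND, and np.logical_and.reduce should give the same mask:

```python
import numpy as np

foo = np.arange(100)

# Three element-wise conditions, all of which must hold.
conditions = [foo >= 40, foo <= 60, foo % 2 == 0]

# Stack the conditions and require all of them along the first axis.
mask_all = np.all(conditions, axis=0)

# Equivalent formulation: reduce the stacked conditions with logical AND.
mask_reduce = np.logical_and.reduce(conditions)

print(foo[mask_all])  # the even numbers from 40 to 60
```

Note that non-Boolean conditions such as foo % 3 are accepted here, because np.any and np.all treat nonzero values as True.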

Greenlee answered 28/6, 2012 at 14:31 Comment(3)
It's a good idea to use np.any and np.all specifically.Barty
If you are thinking about other solutions, do not. It will save you a lot of misery!Heracles
Note that with np.any / np.all commas are sufficient to separate array conditions (no parentheses () required inside the brackets [] ), and you can have more than two conditions (evaluated element-wise as long as axis=0 is set), provided they are of the same type (e.g. all conjunctions or all alternatives, but not a mix of conjunctions and alternatives)Heracles
C
9

Note that you can use ~ for elementwise negation.

arr = np.array([False, True])
~arr

OUTPUT: array([ True, False], dtype=bool)

Also, & performs an element-wise AND:

arr_1 = np.array([False, False, True, True])
arr_2 = np.array([False, True, False, True])

arr_1 & arr_2

OUTPUT:   array([False, False, False,  True], dtype=bool)

These operators also work with pandas Series:

import pandas as pd
ser_1 = pd.Series([False, False, True, True])
ser_2 = pd.Series([False, True, False, True])

ser_1 & ser_2

OUTPUT:
0    False
1    False
2    False
3     True
dtype: bool
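Applied to the mask from the question, ~ and & give an alternative spelling of the same selection. The sketch below also shows, as a caveat, that ~ is a bitwise complement on integer arrays, so it only behaves as a logical NOT when the array is Boolean (the names inside, outside and ints are illustrative):

```python
import numpy as np

foo = np.arange(100)

# Elements inside the closed range 40..60, then negated element-wise.
inside = (foo >= 40) & (foo <= 60)
outside = ~inside

# Same mask as the | version, by De Morgan's law.
same = np.array_equal(outside, (foo < 40) | (foo > 60))
print(same)  # True

# On integers, ~ flips the bits instead of negating truth values.
ints = np.array([0, 1])
bitwise_not = ~ints                 # [-1, -2]
logical_not = np.logical_not(ints)  # [True, False]
print(bitwise_not, logical_not)
```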
Constrain answered 30/9, 2016 at 11:59 Comment(2)
According to the numpy documentation, it seems like & does bitwise and, not elementwise.Thelma
"The & operator can be used as a shorthand for np.logical_and on boolean ndarrays." numpy.org/doc/stable/reference/generated/…Terbia
