Logical operation on two columns of a dataframe

Asked 27/1, 2016 at 17:10 Answered 24/2, 2022 at 8:6

In pandas, I'd like to create a computed column that's a boolean operation on two other columns.

In pandas, it's easy to add two numerical columns together. I'd like to do something similar with the logical AND operator. Here's my first try:

In [1]: d = pandas.DataFrame([{'foo':True, 'bar':True}, {'foo':True, 'bar':False}, {'foo':False, 'bar':False}])

In [2]: d
Out[2]: 
     bar    foo
0   True   True
1  False   True
2  False  False

In [3]: d.bar and d.foo   ## can't
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I guess logical operators don't work quite the same way as numeric operators in pandas. I tried doing what the error message suggests and using bool():

In [258]: d.bar.bool() and d.foo.bool()  ## spoiler: this doesn't work either
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I found a way that works by casting the boolean columns to int, adding them together and evaluating as a boolean.

In [4]: (d.bar.apply(int) + d.foo.apply(int)) > 0  ## Logical OR
Out[4]: 
0     True
1     True
2    False
dtype: bool

In [5]: (d.bar.apply(int) + d.foo.apply(int)) > 1  ## Logical AND
Out[5]: 
0     True
1    False
2    False
dtype: bool

This is convoluted. Is there a better way?

Leund answered 27/1, 2016 at 17:10 Comment(0)

Yes, there is a better way! Just use the element-wise logical AND operator, &:

d.bar & d.foo

0     True
1    False
2    False
dtype: bool
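
As a minimal sketch of how this extends to a computed column, the snippet below rebuilds the question's frame and stores the results of the element-wise operators as new columns. The column names both, either, not_bar and differ are made up for illustration; | (OR), ~ (NOT) and ^ (XOR) are the companion operators to &.

```python
import pandas as pd

# Same data as in the question
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Element-wise logical operators on boolean columns.
# The new column names are hypothetical, chosen only for this example.
d["both"] = d["bar"] & d["foo"]    # AND
d["either"] = d["bar"] | d["foo"]  # OR
d["not_bar"] = ~d["bar"]           # NOT
d["differ"] = d["bar"] ^ d["foo"]  # XOR

print(d)
```

Each result is a boolean Series aligned on the index, so it can be assigned straight back to the frame.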

Lavonna answered 27/1, 2016 at 17:21 Comment(4)

@Leund Yes, there are examples of using & and | in the boolean indexing section of the pandas documentation – Horton 31/5, 2017 at 12:4

Beware this works strictly with boolean arrays. When dealing with other types like Int32, you get an error. – Weatherly 4/4, 2022 at 18:30

A warning: & binds more tightly than comparison operators such as ==, so in more complex expressions don't forget the parentheses (as I did...). E.g.: (df.foo==0) & (df.bar==0) – Encourage 22/10, 2022 at 0:41

geeksforgeeks.org/difference-between-and-and-in-python is a useful overview of the difference between and and & for those like me who came here after ValueError: The truth value of a Series is ambiguous. – Matildamatilde 7/11, 2023 at 11:18
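
The parenthesization caveat raised in the comments can be sketched as follows. The integer data here is invented for illustration. Without parentheses, Python parses the expression as a chained comparison around 0 & df["bar"], which ends up calling bool() on a Series and raises the same ValueError as in the question.

```python
import pandas as pd

# Hypothetical integer data, only to illustrate operator precedence
df = pd.DataFrame({"foo": [0, 0, 1], "bar": [0, 1, 1]})

# Parenthesized: each comparison is evaluated first, then combined with &.
both_zero = (df["foo"] == 0) & (df["bar"] == 0)
print(both_zero.tolist())

# Unparenthesized: parsed as df["foo"] == (0 & df["bar"]) == 0,
# a chained comparison that Python joins with `and`.
try:
    df["foo"] == 0 & df["bar"] == 0
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```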

There is also another option: you can simply multiply for AND or add for OR, without the conversion and extra comparison you used.

AND operation:

d.foo * d.bar

OR operation:

d.foo + d.bar
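
A small sketch, assuming both columns have bool dtype, that compares the arithmetic forms with & and |. The astype(bool) call is a defensive addition of mine, not part of the answer; it keeps the comparison meaningful even if the arithmetic were to return integers, as it would for non-boolean columns.

```python
import pandas as pd

# Same data as in the question
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Arithmetic on boolean columns, coerced back to bool to be safe
and_via_mul = (d["foo"] * d["bar"]).astype(bool)
or_via_add = (d["foo"] + d["bar"]).astype(bool)

# Compare with the dedicated logical operators
print(and_via_mul.equals(d["foo"] & d["bar"]))
print(or_via_add.equals(d["foo"] | d["bar"]))
```

The logical operators arguably state the intent more directly, so the arithmetic form is probably best kept for cases where the columns are known to be boolean.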

Techy answered 10/11, 2020 at 13:48 Comment(0)

d[(d['bar']) & (d['foo'])]
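
Note that the expression above uses the combined mask for boolean indexing, so it returns the rows where both columns are True rather than a new column. A minimal sketch of the difference, where the names mask, rows and d2 and the column name both are chosen here for illustration:

```python
import pandas as pd

# Same data as in the question
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

mask = d["bar"] & d["foo"]   # boolean Series, one value per row

rows = d[mask]               # filtering: keeps only rows where mask is True
d2 = d.assign(both=mask)     # computed column: keeps every row, adds "both"

print(rows)
print(d2)
```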

Hygrograph answered 24/2, 2022 at 8:6 Comment(2)

While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply. – Jeremiahjeremias 24/2, 2022 at 8:13

@Jeremiahjeremias the answer is pretty self explanatory. I liked it – Paunch 25/11, 2022 at 11:29
