Python - Pandas - DataFrame - Explode single column into multiple boolean columns based on conditions
Good morning chaps,

Is there a Pythonic way to explode a DataFrame column into multiple columns of boolean flags, based on some condition (str.contains in this case)?

Let's say I have this:

Position Letter 
1        a      
2        b      
3        c      
4        b      
5        b

And I'd like to achieve this:

Position Letter is_a     is_b    is_c
1        a      TRUE     FALSE   FALSE
2        b      FALSE    TRUE    FALSE
3        c      FALSE    FALSE   TRUE
4        b      FALSE    TRUE    FALSE
5        b      FALSE    TRUE    FALSE 

I can do it with a loop over 'abc', explicitly creating new df columns, but I'm wondering whether a built-in method for this already exists in pandas. The number of possible values, and hence the number of new columns, is variable.

Thanks and regards.

Stansberry answered 15/11, 2017 at 12:42 Comment(2)
Can you show us a minimal example of what you have tried so far? Please have a look here: stackoverflow.com/help/how-to-ask - Coccidiosis
for lt in Letter: df[lt] = df.Letter.str.contains(lt) - Stansberry
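
For reference, the loop from the comment above could be made runnable roughly as follows. This is only a sketch: it assumes the sample frame from the question, iterates over the distinct values of Letter (the comment's bare `Letter` is not defined anywhere), and adds the `is_` prefix from the desired output. Passing `regex=False` is an extra assumption so that each value is matched as a literal substring.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

looped = df.copy()
# One boolean column per distinct value found in Letter
for lt in sorted(looped["Letter"].unique()):
    looped["is_" + lt] = looped["Letter"].str.contains(lt, regex=False)

print(looped)
```

Because str.contains is a substring test, a value such as "ab" would set both is_a and is_b; use `looped["Letter"] == lt` instead if only exact matches should count.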

Use Series.str.get_dummies():

In [31]: df.join(df.Letter.str.get_dummies())
Out[31]:
   Position Letter  a  b  c
0         1      a  1  0  0
1         2      b  0  1  0
2         3      c  0  0  1
3         4      b  0  1  0
4         5      b  0  1  0

or, cast to bool to get True/False flags:

In [32]: df.join(df.Letter.str.get_dummies().astype(bool))
Out[32]:
   Position Letter      a      b      c
0         1      a   True  False  False
1         2      b  False   True  False
2         3      c  False  False   True
3         4      b  False   True  False
4         5      b  False   True  False
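
To get the exact column names asked for in the question (is_a, is_b, ...), one option is to chain add_prefix onto the dummies before joining. A minimal sketch, assuming the sample frame from the question:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

# One indicator column per distinct value, cast to bool and renamed is_<value>
flags = df["Letter"].str.get_dummies().astype(bool).add_prefix("is_")
result = df.join(flags)

print(result)
```

Note that str.get_dummies splits each string on "|" by default, so a cell like "a|b" would set both is_a and is_b.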
Expropriate answered 15/11, 2017 at 13:7 Comment(0)
