Selecting pandas cells with None value

3

20

One column of my pandas DataFrame came from a database query that returned blank cells. The blank cells show up as None, and I want to check, row by row, whether the value is None:

In [325]: yes_records_sample['name']
Out[325]: 
41055    John J Murphy Professional Building
25260                                   None
41757             Armand Bayou Nature Center
31397                                   None
33104               Hubert Humphrey Building
16891                         Williams Hall
29618                                   None
3770                          Covenant House
39618                                   None
1342       Bhathal Student Services Building
20506                                   None

My understanding from the documentation is that I can check whether each row is null with the isnull() function: http://pandas.pydata.org/pandas-docs/dev/missing_data.html#values-considered-missing

That function, however, is not working for me:

In [332]: isnull(yes_records_sample['name'])

I get the following error:

NameError Traceback (most recent call last)
<ipython-input-332-55873906e7e6> in <module>()
----> 1 isnull(yes_records_sample['name'])
NameError: name 'isnull' is not defined

I also saw a related question ("Rename "None" value in Pandas") where someone just replaced the "None" strings, but neither of these variations on that approach worked for me:

yes_records_sample['name'].replace('None', "--no value--")
yes_records_sample['name'].replace(None, "--no value--")

I was ultimately able to work around this with fillna: yes_records_sample.fillna('') fills each of those rows with an empty string, after which I could check yes_records_sample['name'] == ''.

But I am profoundly confused by how None works and what it means. Is there an easy way to check whether a cell in a DataFrame is None?

Connotative answered 12/11, 2014 at 17:57 Comment(0)
O
42

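isnull is a method of the Series (it is also available as pd.isnull), not a bare built-in name, which is why calling isnull(...) on its own raises the NameError. Call it like this: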

yes_records_sample['name'].isnull()
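
For illustration, here is a minimal sketch on a made-up Series. The data and the variable name `names` are assumptions, not the asker's actual frame. It shows that the blank cells hold the Python None object rather than the string 'None', that isnull() can be called either as a method or as pd.isnull, and that the resulting boolean mask can be used to select rows.

```python
import pandas as pd

# Hypothetical stand-in for yes_records_sample['name']
names = pd.Series(
    ["Williams Hall", None, "Covenant House", None],
    index=[16891, 29618, 3770, 39618],
    dtype=object,
)

mask = names.isnull()          # method form
same_mask = pd.isnull(names)   # function form; needs the pd. prefix

missing = names[mask]          # rows whose value is None
present = names[~mask]         # rows that have an actual name

# The cells hold the None object, not the text 'None',
# so comparing against the string matches nothing.
is_text_none = names == "None"
```

That last comparison is probably also why replace('None', ...) had no visible effect in the question: there is no 'None' string in the column to replace.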
Ocieock answered 12/11, 2014 at 17:58 Comment(1)
I've been trying to figure out how to add a column to a pandas data frame that is true if 'impact' == 'HIGH' or 'clin_acc' is not Null. This helped tremendously: nbs_annot['pathogenic'] = (nbs_annot['impact'] == 'HIGH') | ~nbs_annot['clin_acc'].isnull() - Presto
U
1

I couldn't find a built-in that does exactly this, so I do it manually. For a Series, the code is:

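import numpy as np
series = yes_records_sample['name']
n = np.empty_like(series)   # array with the same shape and dtype as the Series
n[...] = None               # fill every slot with None
nones = series.values == n  # element-wise comparison against None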

For a DataFrame, the code is very similar:

import numpy as np
df = yes_records_sample
n = np.empty_like(df)
n[...] = None
nones = df == n

My problem with .isnull() is that it does not distinguish between NaN and None. This may or may not be a problem in your application.
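
If the None/NaN distinction matters, one alternative to the array comparison above is an element-wise identity test with Series.map. This is a sketch on made-up data, not part of the original approach. The names `either` and `only_none` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical object column that mixes None and NaN
series = pd.Series(["a", None, np.nan, "b"], dtype=object)

either = series.isnull()                     # True for both None and NaN
only_none = series.map(lambda v: v is None)  # True for None only
```

The identity test runs a Python function per element, so it is likely slower than isnull() on large columns. It is only worth it when NaN and None really need to be told apart.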

Usher answered 14/11, 2016 at 15:55 Comment(0)
C
1

In case you're checking for None as well as a bunch of other values and want to reuse the same code instead of having a special case for .isnull(), you can use .values in your comparison:

df[df['A'].values == None]
df[df['A'].values == 'foo']  # works just as well for anything else you want to match on
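
A small sketch of how this could be wrapped for reuse. `rows_matching` is a hypothetical helper name and the frame is made-up data. The point is that .values hands the comparison to NumPy, which compares an object array to None element by element.

```python
import pandas as pd


def rows_matching(df, column, value):
    """Hypothetical helper: rows where df[column] equals value, None included."""
    # .values exposes the underlying NumPy array, so == is NumPy's
    # element-wise comparison rather than pandas' own.
    return df[df[column].values == value]


df = pd.DataFrame({"A": ["foo", None, "bar", None]}, dtype=object)

none_rows = rows_matching(df, "A", None)   # rows at index 1 and 3
foo_rows = rows_matching(df, "A", "foo")   # row at index 0
```

Note that linters usually flag `== None` (pycodestyle E711), so a comment explaining that the comparison is intentional may be worth adding.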
Chlor answered 16/6, 2023 at 20:39 Comment(0)
