How to check if particular value (in cell) is NaN in pandas DataFrame?

5

101

Let's say I have the following pandas DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})

Which would look like:

>>> df
     A  B
0  1.0  5
1  NaN  6
2  2.0  0

First option

I know one way to check if a particular value is NaN:

>>> df.isnull().iloc[1,0]
True

But this checks the whole dataframe just to get one value, so I imagine it's wasteful.

Second option (not working)

I thought the option below, using iloc, would work as well, but it doesn't:

>>> df.iloc[1,0] == np.nan
False

However, if I check that value, I get:

>>> df.iloc[1,0]
nan

So, why is the second option not working? Is it possible to check for NaN values using iloc?


Editor's note: This question previously used pd.np instead of np and .ix in addition to .iloc, but since these no longer exist, they have been edited out to keep it short and clear.

Larrup answered 22/11, 2017 at 16:53 Comment(15)
Explanation: try this: pd.np.nan == pd.np.nan ;) - Kilohertz
That gives False! Why is that? - Larrup
#20320522 - Irina
That's the nature of "Not A Number". Because of that we have pd.isnull(), pd.notnull(), IS (NOT) NULL in SQL, etc. - Kilohertz
@ayhan, what do you think: should we close it as a dupe? - Kilohertz
Aha, bingo. Hence using is would be a way to use ix or iloc? - Larrup
@MaxU If the OP thinks that resolves it, sure. - Irina
@CedricZoppolo, you'd better use pd.isnull(); it's a vectorized solution. - Kilohertz
@CedricZoppolo, does #20320522 answer your question? - Kilohertz
Well, it seems it doesn't. Although pd.np.nan is pd.np.nan resolves to True, df.iloc[0,1] is pd.np.nan still resolves to False. - Larrup
@CedricZoppolo, another hint: compare type(pd.np.nan) and type(df.iloc[0,1]). Don't use is for such checks. - Kilohertz
@MaxU, well, it answers part of it. Actually, after looking at that question I thought df.iloc[0,1] is pd.np.nan would resolve to True, but it doesn't. That question doesn't answer my second question: "Is it possible to check for NaN values using ix or iloc?" - Larrup
So the answer to my second question is that it's not possible? I know using isnull is best, and it is the one I will use. But I still wonder why I can't use the second option (using ix or iloc) in some way... - Larrup
stop using .ix. STOP USING .ix!!!! - Edee
@TedPetrou, I guess in my case I can't, as I'm stuck with Python 2.5 in the project I'm working on, because it uses an external API that depends on Python 2.5. - Larrup
163

Try pd.isna():

In [7]: pd.isna(df.iloc[1,0])
Out[7]: True

pd.isnull is an alias of pd.isna.
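For context on why the == comparison in the question returns False: under IEEE 754 floating-point rules, NaN compares unequal to every value, including itself, so a dedicated check is needed. A minimal sketch using the question's DataFrame (the variable name cell is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})

# Scalar lookup: only one value is fetched, no full-frame mask is built.
cell = df.iloc[1, 0]

# NaN is never equal to anything, including itself.
print(cell == np.nan)          # False
print(cell != cell)            # True (only NaN is unequal to itself)

# The dedicated missing-value checks work directly on the scalar.
print(pd.isna(cell))           # True
print(pd.notna(cell))          # False
print(pd.isna(df.iloc[0, 0]))  # False (that cell holds 1.0)
```

This keeps the iloc lookup from the question and only swaps the comparison for pd.isna, which avoids computing df.isnull() over the whole frame.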

Kilohertz answered 22/11, 2017 at 17:6 Comment(6)
Great. That answers the second part of the question. I guess another way would be pd.isnull(df).iloc[1][0] - Larrup
@CedricZoppolo, I like your original version (in the question) - df.isnull().ix[1,0] better - Kilohertz
@Cedric also np.isnan(df.iloc[1,0]) to check if a number is nan. - Underestimate
And I would add df.iloc[1,0] is pd.np.nan resolves to False as the types are not the same. type(df.iloc[1,0]) resolves to <type 'numpy.float64'> and type(pd.np.nan) resolves to <type 'float'>, as suggested by @MaxU in question's comment, to check why this happens. - Larrup
@CedricZoppolo and MaxU Isn't pd.isnull(df).iloc[1][0] far less efficient? - Stopgap
Opposite of pd.isna() is pd.notna() - Dove
14

The above answer is excellent. Here is the same with an example for better understanding.

>>> import pandas as pd
>>> import numpy as np
>>> s = pd.Series([np.nan, 34, 56])
>>> s
0     NaN
1    34.0
2    56.0
dtype: float64
>>> pd.isnull(s[0])
True

I also tried a few other checks, and none of the following worked. Thanks to @MaxU.

>>> s[0]
nan
>>> s[0] == np.nan
False
>>> s[0] is np.nan
False
>>> s[0] == 'nan'
False
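As a hedged illustration of why the is check fails: the value pulled out of the Series is a numpy.float64 object, while np.nan is a plain Python float, so they can never be the same object. The sketch below (the name value is illustrative) also shows two scalar alternatives to pd.isnull:

```python
import math

import numpy as np
import pandas as pd

s = pd.Series([np.nan, 34, 56])
value = s.iloc[0]

# Different types, hence different objects: the identity check cannot succeed.
print(type(np.nan))        # <class 'float'>
print(type(value))         # <class 'numpy.float64'>
print(value is np.nan)     # False

# Checks that do recognise the NaN.
print(pd.isna(value))      # True
print(np.isnan(value))     # True
print(math.isnan(value))   # True
```

Note that np.isnan and math.isnan only accept numeric input and raise TypeError for values such as None or strings, whereas pd.isna accepts those as well.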
Randeerandel answered 9/12, 2018 at 17:29 Comment(0)
12

pd.isna(cell_value) can be used to check whether a given cell value is NaN. Conversely, pd.notna(cell_value) checks the opposite.

From the pandas source code:

def isna(obj):
    """
    Detect missing values for an array-like object.

    This function takes a scalar or array-like object and indicates
    whether values are missing (``NaN`` in numeric arrays, ``None`` or ``NaN``
    in object arrays, ``NaT`` in datetimelike).

    Parameters
    ----------
    obj : scalar or array-like
        Object to check for null or missing values.

    Returns
    -------
    bool or array-like of bool
        For scalar input, returns a scalar boolean.
        For array input, returns an array of boolean indicating whether each
        corresponding element is missing.

    See Also
    --------
    notna : Boolean inverse of pandas.isna.
    Series.isna : Detect missing values in a Series.
    DataFrame.isna : Detect missing values in a DataFrame.
    Index.isna : Detect missing values in an Index.

    Examples
    --------
    Scalar arguments (including strings) result in a scalar boolean.

    >>> pd.isna('dog')
    False

    >>> pd.isna(np.nan)
    True
Hal answered 30/5, 2019 at 20:0 Comment(2)
The functions pd.isnull and pd.isna do exactly the same thing. In fact, pd.isnull is an alias of pd.isna, as stated in this post. - Larrup
Agreed. For beginners, isna will be more readable when working with pandas. - Hal
-1

I came up with a workaround:

x = [np.nan]

In [4]: x[0] == np.nan
Out[4]: False

but:

In [5]: np.nan in x
Out[5]: True

To understand why this works, look at how list containment (the in operator) is implemented: it checks identity (is) before falling back to equality (==).
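A caveat worth illustrating: the in check succeeds only because the list holds the very same np.nan object. With any other NaN object, including a value read from a DataFrame, the identity test fails and the equality fallback is False as well. A minimal sketch (other_nan and the reuse of the question's df are illustrative assumptions):

```python
import numpy as np
import pandas as pd

x = [np.nan]

# Same object: list containment matches on identity before trying ==.
print(np.nan in x)                  # True

# A different NaN object: identity fails, and NaN == NaN is False.
other_nan = float("nan")
print(other_nan in x)               # False

# A NaN read from a DataFrame is a numpy.float64, again a different object.
df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})
print(df.iloc[1, 0] in x)           # False

# A check that does not depend on object identity.
print(any(pd.isna(v) for v in x))   # True
```

So the workaround is not a reliable way to test a DataFrame cell; pd.isna on the scalar does not have this limitation.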

Faircloth answered 18/3, 2022 at 9:15 Comment(0)
-1
df.isnull().loc[1,0]

I tried the above syntax and it worked.

Triazine answered 8/2, 2023 at 4:35 Comment(1)
That might work for your case, but in OP's case, .loc[1,0] raises KeyError: 0. Maybe you meant .iloc instead, but then, doing df.isnull() on the whole dataframe is wasteful when you just want one value. I just updated the question to say that btw. - Demonstrate
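To make the comment above concrete, here is a hedged sketch of the difference between label-based and position-based access on the question's DataFrame, whose columns are labelled "A" and "B". The flag name raised is illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})

# Label-based access needs the column label "A", not the position 0.
print(pd.isna(df.loc[1, "A"]))   # True
print(pd.isna(df.at[1, "A"]))    # True (at is the scalar-only accessor)

# Position-based access uses integer positions for rows and columns.
print(pd.isna(df.iloc[1, 0]))    # True
print(pd.isna(df.iat[1, 0]))     # True (iat is the scalar-only accessor)

# There is no column labelled 0, so .loc[1, 0] fails.
raised = False
try:
    df.loc[1, 0]
except KeyError:
    raised = True
print(raised)                    # True
```

Each of the four working lookups fetches a single scalar, so none of them builds the full boolean frame that df.isnull() would.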
