Pandas isna() and isnull(), what is the difference?

Pandas has both isna() and isnull(). I usually use isnull() to detect missing values and have never run into a case where I needed anything else. So when should I use isna()?

Darky answered 29/8, 2018 at 21:52 Comment(0)

isnull is an alias for isna. Literally, in the pandas source code:

isnull = isna

Indeed:

>>> import pandas as pd
>>> pd.isnull
<function isna at 0x7fb4c5cefc80>

So I would recommend using isna.
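The alias can also be checked without reading the memory address in the repr. This is a small sketch, assuming a reasonably recent pandas; the variable names are only for illustration:

```python
import pandas as pd

# Both names are expected to be bound to the very same function object.
same_isna = pd.isnull is pd.isna
same_notna = pd.notnull is pd.notna

print(same_isna, same_notna)

# The function reports its own name, whichever alias was used to reach it.
print(pd.isnull.__name__)
```

An identity check with `is` is stricter than comparing results: it shows there is only one function behind both names.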

Planometer answered 3/10, 2018 at 13:40 Comment(11)
Is there a recommendation which to use? Is one of them just legacy? - Deedee
Since isnull is an alias for isna, I would tend to prefer isna. Indeed, isna seems to be used more often than isnull. - Planometer
"There should be one-- and preferably only one --obvious way to do it." - Cinema
Presumably the same would apply to notna and notnull? - Aristocratic
Not only should we use isna for clarity, but isnull should also be deprecated. isnull which returns False for null values indicates some grass excess during design and quality review. - Nyeman
More explanation here: pandas.pydata.org/pandas-docs/stable/reference/api/… - Lousy
I'm super annoyed that we have np.nan and np.isnan() but then pandas is pd.isna(). - Eddie
pd.isnull is not the same as np.isnan; pd.isnull (and isna) work on object dtypes and return True for None, so it makes sense to have a different name. - Emplane
numpy.isnan checks for NaN, which is the "IEEE 754 floating point representation of Not a Number (NaN)". Paradoxical as it sounds, NaN is a number: it is a specific IEEE float constant saying "for whatever reason, I am not a regular number, I am special". But in a bigger world with non-IEEE types like strings or dates, that is still number-ish. isna()/isnull(), in contrast, disavow any value in that cell: "it's a vacuum here". - Oralle
One more way of putting it: NaN is a "numerical null" and NA is a "universal null". I guess it sometimes matters, if someone decided to keep them different? - Oralle
It's worth adding that "DataFrame.isnull is an alias for DataFrame.isna" and "Series.isnull is an alias for Series.isna", though they're implemented as wrapper functions instead of plain aliases, I suppose so that the documentation can say exactly that they're aliases. - Helio
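For the method versions mentioned in the last comment, a quick way to convince yourself is to compare the results rather than the function objects. A minimal sketch, where the frame and its column names are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one float column and one string column, each with a gap.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

# Compare the boolean masks produced by each spelling.
frame_match = df.isnull().equals(df.isna())
series_match = df["b"].isnull().equals(df["b"].isna())
negated_match = df.notnull().equals(df.notna())

print(frame_match, series_match, negated_match)
```

Comparing outputs sidesteps the question of whether the methods are plain aliases or thin wrappers, which may differ between pandas versions.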

They are the same. As a best practice, prefer isna() over isnull().

isna() is easy to remember because its name resembles the NumPy function np.isnan(), which checks for NaN values. pandas also has similarly named methods for handling missing values, such as dropna() and fillna(), so the whole family is easy to recall.
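To illustrate the naming family, here is a short sketch on a made-up Series (the values and variable names are only for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

missing_count = int(s.isna().sum())  # count the missing entries
filled = s.fillna(0.0)               # replace missing entries with 0.0
dropped = s.dropna()                 # discard missing entries

print(missing_count)
print(filled.tolist())
print(dropped.tolist())
```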

Procto answered 15/8, 2019 at 8:42 Comment(1)
pd.isnull()/isna has more functionality and different behaviour than np.isnan; it can also work on non-float datatypes, detect None and so on. - Emplane
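The difference raised in this comment can be seen on a few scalars. This is only a sketch: the helper `isnan_or_error` is a hypothetical name introduced here, and it records a TypeError instead of letting np.isnan fail on inputs it does not support.

```python
import numpy as np
import pandas as pd

values = [1.5, np.nan, None, "text"]

# pd.isna accepts arbitrary Python objects.
pandas_flags = [bool(pd.isna(v)) for v in values]

def isnan_or_error(value):
    """Return np.isnan(value) as a bool, or the string 'TypeError' if it is unsupported."""
    try:
        return bool(np.isnan(value))
    except TypeError:
        return "TypeError"

numpy_flags = [isnan_or_error(v) for v in values]

print(pandas_flags)
print(numpy_flags)

# pd.isna also treats the datetime missing marker as missing.
nat_is_missing = bool(pd.isna(pd.NaT))
print(nat_is_missing)
```

So the two agree on floats, but only pd.isna (and therefore pd.isnull) handles None and strings.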

The documentation for both is literally identical.

pandas.isna()

pandas.isnull()

The following page even says, in its "See also" section, that DataFrame.isnull is an alias of isna:

pandas.DataFrame.isnull()

Therefore, they must be the same thing, like np.nan, np.NaN, np.NAN.

Sg answered 5/9, 2018 at 4:57 Comment(0)
