float('nan') represents NaN (not a number). But how do I check for it?
Use math.isnan:
>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True
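As a minimal sketch of why a dedicated check is needed (standard library only; the variable names are illustrative), NaN compares unequal to everything, including itself, so == cannot detect it, while math.isnan can:

```python
import math

# Three ways to obtain a NaN using only the standard library.
nan_from_string = float("nan")
nan_constant = math.nan                        # available since Python 3.5
nan_from_arithmetic = float("inf") - float("inf")

candidates = [nan_from_string, nan_constant, nan_from_arithmetic]

# Equality never matches a NaN, even against itself.
equal_to_self = [value == value for value in candidates]

# math.isnan identifies every one of them.
detected = [math.isnan(value) for value in candidates]

print(equal_to_self)  # [False, False, False]
print(detected)       # [True, True, True]
```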
float("nan") as it does with numpy.core.numeric.NaN, while comparing the two with is does not work. Hence this might be the preferable solution in (legacy?) code possibly containing both definitions, if I'm not mistaken? – Mide
math.isnan preferred to np.isnan()? – Fogbound
import numpy takes around 15 MB of RAM, whereas import math takes some 0.2 MB – Dansby
isdigit is not a check for numbers. For example, '1.0'.isdigit() produces False. – Frap
numpy.isnan is a superior choice, as it handles NumPy arrays. If you're not using NumPy, there's no benefit to taking a NumPy dependency and spending the time to load NumPy just for a NaN check (but if you're writing the kind of code that does NaN checks, it's likely you should be using NumPy). – Frap
float('nan') == float('nan') returns False, which is a strange convention, but basically part of the definition of a NaN. The approach you want is actually the one posted by Chris Jester-Young, below. – Paige
math.isnan seems to be faster than np.isnan (about 20 times on my machine) – Timpani
numpy.isnan on a large array is faster than using math.isnan by a factor of over 200. – Frap
from numpy import isnan – Negotiation
NaN == NaN returns False? – Fayina
NaN != x for every x. Which means you can do x=float('nan'); if x != x: print("it's not a number") – Gifford
The usual way to test for a NaN is to see if it's equal to itself:
def isNaN(num):
    return num != num
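The self-comparison relies on != behaving normally for the object being tested. As a hedged sketch, the variant below adds a type check so that only float NaNs count. The names is_nan_loose, is_nan_strict and the Contrary class are hypothetical, invented for this illustration.

```python
def is_nan_loose(num):
    # Self-comparison: true for NaN, but also for any object with an odd __ne__.
    return num != num


def is_nan_strict(num):
    # Only floats (and float subclasses) can be NaN here.
    return isinstance(num, float) and num != num


class Contrary:
    """Hypothetical type whose instances claim to differ from everything."""

    def __ne__(self, other):
        return True


nan = float("nan")
odd = Contrary()

print(is_nan_loose(nan), is_nan_strict(nan))        # True True
print(is_nan_loose(odd), is_nan_strict(odd))        # True False
print(is_nan_loose("text"), is_nan_strict("text"))  # False False
```

The strict version costs one isinstance call and rejects non-float types such as decimal.Decimal, which may or may not be what you want.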
math.isnan(x) requires x to be a real number, incurring the overhead of verifying the type of x (and possibly converting x to a real number) before you can even check for NaN. x != x is succinct and robust -- bravo! – Trough
np.isnan and math.isnan will both break in this case. – Terzas
nan being the only thing in the universe not equal to itself. AT THE VERY LEAST it should be return isinstance(num, float) and num != num. The overhead of verifying the type is better than the possibility of actually being wrong, which this can be. – Yeomanry
__eq__ is defined as constant False for some abstract type and of course should have a type check. Otherwise I would say that "some string" is also Not A Number (and even doesn't have NaN or not-NaN semantics at all). – Wohlen
__eq__ to produce nonsense this function will indeed produce nonsense but that is not the fault of this function. And NaN refers to certain values in float; a string or anything non-float should never be considered NaN even though it is "not a number" in some sense. That being said I would still prefer np or math for this problem. – Wheelbase
numpy.isnan(number) tells you if it's NaN or not.
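A small sketch of the array case, assuming NumPy is installed; the sample data is made up for illustration.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0, np.nan])

mask = np.isnan(data)              # element-wise boolean mask
print(mask.tolist())               # [False, True, False, True]
print(bool(mask.any()))            # True: at least one NaN
print(bool(mask.all()))            # False: not every element is NaN
print(int(np.count_nonzero(mask))) # 2
print(data[~mask].tolist())        # [1.0, 3.0]
```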
numpy.all(numpy.isnan(data_list)) is also useful if you need to determine if all elements in the list are nan – Planimeter
all(map(math.isnan, [float("nan")]*5)) – Toback
math. – Heterogeneous
numpy.isnan can handle arrays while math.isnan throws: TypeError: only size-1 arrays can be converted to Python scalars. – Hyland
Decimal, you should use d.is_nan() instead of math.isnan(d). Feeding Decimal instances to math functions is a bad habit to get into, because most math functions will convert the input to float and defeat the point of using Decimal in the first place. – Frap
np.isnan('foo') causes a TypeError exception because it can't handle strings. Use pd.isna() instead, if you need to handle strings and are already using Pandas. That can handle float('nan') as well as strings. – Analytic
Here are three ways to test whether a variable is "NaN" or not.
import pandas as pd
import numpy as np
import math
# For single variable all three libraries return single boolean
x1 = float("nan")
print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")
Output:
It's pd.isna: True
It's np.isnan: True
It's math.isnan: True
pd.isnan() or pd.isna()? That is the question :D – Workingwoman
if not np.isnan(x): to be quite useful. – Venturous
pd.isna('foo') is also the only one that can handle strings. np.isnan('foo') and math.isnan('foo') will result in TypeError exception. – Analytic
pd.isna(None) is also True. – Erepsin
It seems that checking if it's equal to itself (x != x) is the fastest.
import pandas as pd
import numpy as np
import math
x = float('nan')
%timeit x != x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
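Outside IPython, a comparable measurement can be sketched with the standard timeit module. The results depend on the machine and Python version, so no figures are shown here, and the loop count is an arbitrary choice.

```python
import math
import timeit

x = float("nan")
loops = 200_000  # arbitrary; raise it for steadier numbers

# Each statement is run `loops` times in a namespace containing x and math.
namespace = {"x": x, "math": math}
timings = {
    "x != x": timeit.timeit("x != x", globals=namespace, number=loops),
    "math.isnan(x)": timeit.timeit("math.isnan(x)", globals=namespace, number=loops),
}

for statement, seconds in timings.items():
    print(f"{statement:<15} {seconds / loops * 1e9:8.1f} ns per loop")
```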
z = float('inf'), z != z evaluates to false. – Ostracon
z=float('inf') and then z==z give True. x=float('nan') and then x==x give False. – Ketonuria
numpy or another tensor library, anyway. – Alicaalicante
x != x and math.isnan(x) disappears; they're both about 35 ns on my system. You can use %timeit in cell mode to check: 1) %%timeit x = float('nan') <newline> x != x 2) %%timeit x = float('nan'); from math import isnan <newline> isnan(x) – Erepsin
math.isnan will compete very differently when a function is actually required and x != x would need wrapping in a lambda. A numpy functionality such as numpy.isnan will compete very differently when applied to a numpy array where x != x would require iteration. – Overmantel
Here is an answer that works with:
- NaN implementations respecting the IEEE 754 standard
  - i.e. Python's NaN: float('nan'), numpy.nan ...
- any other objects: string or whatever (does not raise exceptions if encountered)
A NaN implemented following the standard is the only value for which the inequality comparison with itself should return True:
def is_nan(x):
    return (x != x)
And some examples:
import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")
Output:
nan : True
nan : True
55 : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False
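The same self-comparison also holds for a quiet decimal.Decimal NaN, although Decimal offers its own is_nan() method. A brief sketch using only the standard library:

```python
import math
from decimal import Decimal

d = Decimal("NaN")

print(d != d)         # True: a quiet Decimal NaN is unequal to itself
print(d.is_nan())     # True: Decimal's own check, no float conversion
print(math.isnan(d))  # True, but only after converting d to float

# An ordinary Decimal is not NaN under any of the three checks.
one = Decimal("1.5")
print(one != one, one.is_nan(), math.isnan(one))  # False False False
```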
numpy.nan is a regular Python float object, just like the kind returned by float('nan'). Most NaNs you encounter in NumPy will not be the numpy.nan object. – Frap
numpy.nan defines its NaN value on its own in the underlying library in C. It does not wrap python's NaN. But now, they both comply with IEEE 754 standard as they rely on C99 API. – Batfish
float('nan') is float('nan') (non-unique) and np.nan is np.nan (unique) – Batfish
np.nan is a specific object, while each float('nan') call produces a new object. If you did nan = float('nan'), then you'd get nan is nan too. If you constructed an actual NumPy NaN with something like np.float64('nan'), then you'd get np.float64('nan') is not np.float64('nan') too. – Frap
numpy.nan. Even numpy.array([numpy.nan])[0] is not numpy.nan. – Frap
I actually just ran into this, but for me it was checking for nan, -inf, or inf. I just used
if float('-inf') < float(num) < float('inf'):
This is true for numbers, false for nan and both inf, and will raise an exception for things like strings or other types (which is probably a good thing). Also this does not require importing any libraries like math or numpy (numpy is so damn big it doubles the size of any compiled application).
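On Python 3.2 and later, math.isfinite expresses the same test. A sketch comparing the two; is_finite_by_range is a hypothetical helper name wrapping the comparison above.

```python
import math


def is_finite_by_range(num):
    # Chained comparison: false for NaN and for both infinities.
    return float("-inf") < float(num) < float("inf")


samples = [0, 1.5, 1e308, float("inf"), float("-inf"), float("nan")]

for value in samples:
    print(value, is_finite_by_range(value), math.isfinite(value))

# Non-numeric input raises instead of returning a misleading answer.
try:
    is_finite_by_range("hello")
except ValueError as exc:
    print("rejected:", exc)
```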
math.isfinite was not introduced until Python 3.2, so given the answer from @Afterward was posted in 2012 it was not exactly "reinvent[ing] the wheel" - solution still stands for those working with Python 2. – Selfoperating
pd.eval expression. For example pd.eval(float('-inf') < float('nan') < float('inf')) will return False – Oriana
or compare the number to itself. NaN is always != NaN, otherwise (e.g. if it is a number) the comparison should succeed.
I came to this post because I've had some issues with the function:
math.isnan()
There is a problem when you run this code:
import math
a = "hello"
math.isnan(a)
It raises an exception. My solution for that is to make another check:
def is_nan(x):
    return isinstance(x, float) and math.isnan(x)
def is_nan(x): try: return math.isnan(x) except: return False – Inoculation
Another method if you're stuck on <2.6, you don't have numpy, and you don't have IEEE 754 support:
def isNaN(x):
    return str(x) == str(1e400*0)
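A sketch of why this can work, assuming a current CPython where floats are IEEE 754 doubles: the literal 1e400 overflows to infinity, and infinity times zero is NaN, so the two string forms match. The string produced for NaN was platform-dependent on the old Python versions the answer targets, which is why the comparison is made against a computed value rather than a fixed literal.

```python
import math

overflowed = 1e400        # too large for a double, becomes inf
product = overflowed * 0  # inf * 0 is NaN

print(math.isinf(overflowed))  # True
print(math.isnan(product))     # True


def isNaN(x):
    # Compare string forms, as in the answer above.
    return str(x) == str(1e400 * 0)


print(isNaN(float("nan")), isNaN(1.0), isNaN(float("inf")))  # True False False
```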
With python < 2.6 I ended up with
def isNaN(x):
    return str(float(x)).lower() == 'nan'
This works for me with python 2.5.1 on a Solaris 5.9 box and with python 2.6.5 on Ubuntu 10
-1.#IND – Folacin
Comparison of pd.isna, math.isnan and np.isnan and their flexibility in dealing with different types of objects.
The table below shows if the type of object can be checked with the given method:
+------------+-----+---------+------+--------+------+
| Method | NaN | numeric | None | string | list |
+------------+-----+---------+------+--------+------+
| pd.isna | yes | yes | yes | yes | yes |
| math.isnan | yes | yes | no | no | no |
| np.isnan | yes | yes | no | no | yes | <-- # will error on mixed type list
+------------+-----+---------+------+--------+------+
pd.isna
The most flexible method to check for different types of missing values.
None of the answers cover the flexibility of pd.isna. While math.isnan and np.isnan will return True for NaN values, you cannot use them to check different types of objects like None or strings. Both methods will raise an error, so checking a list with mixed types will be cumbersome. pd.isna, by contrast, is flexible and will return the correct boolean for different kinds of types:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: missing_values = [3, None, np.nan, pd.NA, pd.NaT, '10']
In [4]: pd.isna(missing_values)
Out[4]: array([False, True, True, True, True, False])
I am receiving the data from a web-service that sends NaN as a string 'Nan'. But there could be other sorts of string in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer:
import math

def isnan(value):
    try:
        return math.isnan(float(value))
    except:
        return False
Requirement:
isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) == True
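A sketch of the same idea with the exception handling narrowed to the errors float() is known to raise, checked against the requirements listed above. The name isnan_lenient is hypothetical.

```python
import math


def isnan_lenient(value):
    """Return True only if value converts to a float NaN."""
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError, OverflowError):
        # TypeError: e.g. None; ValueError: e.g. 'hello';
        # OverflowError: an int too large for a float.
        return False


print(isnan_lenient("hello"))       # False
print(isnan_lenient("NaN"))         # True
print(isnan_lenient("Nan"))         # True: float() parses 'nan' case-insensitively
print(isnan_lenient(100))           # False
print(isnan_lenient(float("nan")))  # True
print(isnan_lenient(None))          # False
```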
try: int(value) – Fridafriday
value being NaN or not? – Remotion
NaN is (like in python what you could get from float('inf') * 0), and thus although the string 'Hello' is not a number, it is also not NaN because NaN is still a numeric value! – Remotion
int(value) For all exceptions, False will be written. – Cid
All the methods to tell if the variable is NaN or None:
None type
In [1]: import math
In [2]: a = None
In [3]: not a
Out[3]: True
In [4]: len(a or ()) == 0
Out[4]: True
In [5]: a == None
Out[5]: True
In [6]: a is None
Out[6]: True
In [7]: a != a
Out[7]: False
In [9]: math.isnan(a)
Traceback (most recent call last):
File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
math.isnan(a)
TypeError: a float is required
In [10]: len(a) == 0
Traceback (most recent call last):
File "<ipython-input-10-65b72372873e>", line 1, in <module>
len(a) == 0
TypeError: object of type 'NoneType' has no len()
NaN type
In [11]: b = float('nan')
In [12]: b
Out[12]: nan
In [13]: not b
Out[13]: False
In [14]: b != b
Out[14]: True
In [15]: math.isnan(b)
Out[15]: True
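Putting the two cases together, a hedged sketch of a helper that treats both None and float NaN as missing. The name is_none_or_nan is hypothetical, not a library function.

```python
import math


def is_none_or_nan(value):
    # `is None` handles None; the isinstance guard keeps math.isnan
    # from raising TypeError on strings and other non-float objects.
    return value is None or (isinstance(value, float) and math.isnan(value))


for item in [None, float("nan"), 0.0, "", "nan", [], 42]:
    print(repr(item), is_none_or_nan(item))
```

Unlike the truthiness test `not a`, this does not flag 0.0, empty strings or empty lists as missing.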
How to remove NaN (float) item(s) from a list of mixed data types
If you have mixed types in an iterable, here is a solution that does not use numpy:
from math import isnan
Z = ['a','b', float('NaN'), 'd', float('1.1024')]
[x for x in Z if not (
type(x) == float # let's drop all float values…
and isnan(x) # … but only if they are nan
)]
['a', 'b', 'd', 1.1024]
Short-circuit evaluation means that isnan will not be called on values that are not of type 'float', as False and (…) quickly evaluates to False without having to evaluate the right-hand side.
In Python 3.6, calling math.isnan(x) or np.isnan(x) on a string value x raises an error. So I can't check whether a given value is NaN if I don't know beforehand that it's a number. The following seems to solve this issue:
if str(x) == 'nan' and not isinstance(x, str):
    print('NaN')
else:
    print('non NaN')
For nan of type float
>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False
If you want to check for values that are not NaN, then negate whatever is used to flag NaNs; pandas has its own dedicated function for flagging non-NaN values.
import math
import numpy as np
import pandas as pd

lst = [1, 2, float('nan')]
m1 = [e == e for e in lst]              # [True, True, False]
m2 = [not math.isnan(e) for e in lst]   # [True, True, False]
m3 = ~np.isnan(lst)                     # array([ True,  True, False])
m4 = pd.notna(lst)                      # array([ True,  True, False])
This is especially useful if you want to filter values that are not NaN. For ndarray/Series objects, == is vectorized, so it can be used as well.
s = pd.Series(lst)
arr = np.array(lst)
x = s[s.notna()]
y = s[s==s] # `==` is vectorized
z = arr[~np.isnan(arr)] # array([1., 2.])
assert (x == y).all() and (x == z).all()
For strings in pandas, use pd.isnull:
if not pd.isnull(atext):
    for word in nltk.word_tokenize(atext):
The function, used for feature extraction with NLTK:
import nltk
import pandas as pd

# default_stopwords is assumed to be a collection of stop words defined elsewhere.
def act_features(atext):
    features = {}
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features
isinstance(float("nan"), Number) ;-P – Kliber