How to compare two numpy arrays with some NaN values?
A

4

5

I need to compare some numpy arrays which should have the same elements in the same order, except for some NaN values in the second one.

I need a function more or less like this:

def func( array1, array2 ):
    if ???:
        return True
    else:
        return False

Example:

x = np.array( [ 1, 2, 3, 4, 5 ] )
y = np.array( [ 11, 2, 3, 4, 5 ] )
z = np.array( [ 1, 2, np.nan, 4, 5] )

func( x, z ) # returns True
func( y, z ) # returns False

The arrays always have the same length, and the NaN values are always in the third one (x and y always contain numbers only). I imagine there is already a function for this, but I just can't find it.

Any ideas?

Agnesse answered 28/1, 2017 at 20:0 Comment(2)
You do not specify when the two arrays count as equal. What if the second one has NaN values: should these be ignored? Disraeli
@WillemVanOnsem My bad, corrected already ;) Agnesse
I
6

You can use masked arrays, which have the behaviour you're asking for when combined with np.all:

zm = np.ma.masked_where(np.isnan(z), z)

np.all(x == zm) # returns True
np.all(y == zm) # returns False
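
A minimal sketch of why this works, using the arrays from the question: comparing against a masked array yields a masked result, and np.all skips the masked positions. Here np.ma.masked_invalid is swapped in as a shorthand that masks NaN (and inf) entries, and the bool() wrapper is only an assumption made to get plain Python booleans.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([11, 2, 3, 4, 5])
z = np.array([1, 2, np.nan, 4, 5])

# Mask every NaN (and inf) entry of z; for this z it is equivalent to
# np.ma.masked_where(np.isnan(z), z).
zm = np.ma.masked_invalid(z)

# The masked position is ignored when the comparison is reduced with np.all.
same_xz = bool(np.all(x == zm))
same_yz = bool(np.all(y == zm))

print(same_xz, same_yz)  # True False
```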

Or you could just write out your logic explicitly, noting that elementwise logic on numpy arrays needs | instead of or, and that | binds more tightly than ==, so the comparison has to be parenthesized:

def func(a, b):
    return np.all((a == b) | np.isnan(a) | np.isnan(b))
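
To illustrate the precedence point, here is a small sketch (float arrays are assumed, and the variable name unparenthesized_failed is only for illustration). Without the parentheses, the expression is parsed as a == (b | np.isnan(b)), and bitwise OR is not defined for float arrays:

```python
import numpy as np

def func(a, b):
    # The parentheses around (a == b) are required because | binds tighter than ==.
    return bool(np.all((a == b) | np.isnan(a) | np.isnan(b)))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([11.0, 2.0, 3.0, 4.0, 5.0])
z = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

try:
    # Parsed as x == (z | np.isnan(z)); float | bool raises TypeError.
    x == z | np.isnan(z)
    unparenthesized_failed = False
except TypeError:
    unparenthesized_failed = True

print(func(x, z), func(y, z), unparenthesized_failed)  # True False True
```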
Ichneumon answered 28/1, 2017 at 20:6 Comment(0)
A
2

You could use isclose to check for equality (or closeness to within a given tolerance -- this is particularly useful when comparing floats) and use isnan to check for NaNs in the second array. Combine the two with bitwise-or (|), and use all to demand every pair is either close or contains a NaN to obtain the desired result:

In [62]: np.isclose(x,z)
Out[62]: array([ True,  True, False,  True,  True], dtype=bool)

In [63]: np.isnan(z)
Out[63]: array([False, False,  True, False, False], dtype=bool)

So you could use:

def func(a, b):
    return (np.isclose(a, b) | np.isnan(b)).all()


In [67]: func(x, z)
Out[67]: True

In [68]: func(y, z)
Out[68]: False
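
As a rough sketch of why the tolerance matters, the variant below passes rtol and atol through to np.isclose (the values shown are its documented defaults; exposing them as parameters is an assumption, not part of the answer above). The sample arrays are made up for illustration:

```python
import numpy as np

def func(a, b, rtol=1e-05, atol=1e-08):
    # Each pair must be close within tolerance, or b must hold a NaN there.
    return bool((np.isclose(a, b, rtol=rtol, atol=atol) | np.isnan(b)).all())

a = np.array([0.1 + 0.2, 2.0, 3.0])
b = np.array([0.3, np.nan, 3.0])

# Exact equality fails because 0.1 + 0.2 is not exactly 0.3 in floating point.
exact = bool(((a == b) | np.isnan(b)).all())
tolerant = func(a, b)

print(exact, tolerant)  # False True
```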
Aedile answered 28/1, 2017 at 20:9 Comment(0)
D
1

What about:

from math import isnan

def fun(array1,array2):
    return all(isnan(x) or isnan(y) or x == y for x,y in zip(array1,array2))

This function works in both directions (if there are NaNs in the first list, these are also ignored). If you do not want that (which is a bit odd, since equality is usually symmetric), you can define:

from math import isnan

def fun(array1,array2):
    return all(isnan(y) or x == y for x,y in zip(array1,array2))

The code works as follows: we use zip to emit tuples of elements of both arrays. In the first version we then check whether the element of the first array is NaN, or the element of the second is, or the two are equal; the one-directional version drops the first check.

If you want a more robust function, you should also perform a length check:

from math import isnan

def fun(array1,array2):
    return len(array1) == len(array2) and all(isnan(y) or x == y for x,y in zip(array1,array2))
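
A short usage sketch of the length-checked version, assuming plain Python lists as input (it needs no numpy, though the Python-level loop is likely to be slower than the vectorized answers on large arrays):

```python
from math import isnan, nan

def fun(array1, array2):
    # Lengths must match, and each pair must be equal unless array2 holds a NaN there.
    return len(array1) == len(array2) and all(
        isnan(y) or x == y for x, y in zip(array1, array2)
    )

x = [1, 2, 3, 4, 5]
y = [11, 2, 3, 4, 5]
z = [1, 2, nan, 4, 5]

print(fun(x, z))          # True
print(fun(y, z))          # False
print(fun([1, 2, 3], z))  # False, because the lengths differ
```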
Disraeli answered 28/1, 2017 at 20:5 Comment(0)
G
1

numpy.isclose() now provides an argument equal_nan for this case!

>>> import numpy as np
>>> np.isclose([1.0, np.nan], [1.0, np.nan])
array([ True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
array([ True,  True])

Docs: https://numpy.org/doc/stable/reference/generated/numpy.isclose.html

Gyn answered 10/12, 2022 at 20:14 Comment(1)
Unfortunately, this does not address the question, because the question is not about comparing NaN values with each other. Agnesse
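
The distinction raised in this comment can be checked with a small sketch using the arrays from the question: equal_nan=True only makes a NaN match another NaN, so a number paired with a NaN still compares as not close. The variable names are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
z = np.array([1, 2, np.nan, 4, 5])

# NaN against NaN: treated as equal when equal_nan=True.
nan_vs_nan = bool(np.isclose(z, z, equal_nan=True).all())

# Number against NaN: still not close, so the question's case is not covered.
num_vs_nan = bool(np.isclose(x, z, equal_nan=True).all())

# Explicitly ignoring NaNs in the second array gives the behaviour asked for.
ignoring_nan = bool((np.isclose(x, z) | np.isnan(z)).all())

print(nan_vs_nan, num_vs_nan, ignoring_nan)  # True False True
```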
