Making numpy.nanargmin return nan if a column is all nan
Is it possible to use numpy.nanargmin so that it returns numpy.nan for columns that contain only nans? Right now it raises a ValueError when that happens, and I can't use numpy.argmin, since that fails when a column contains only a few nans.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.nanargmin.html says that the ValueError is raised for all-nan slices. In that case, I want it to return numpy.nan instead (just to further mask the "non-data" with nans).

The loop below does this, but it is very slow and not really Pythonic:

for i in range(R.shape[0]):
    bestindex = numpy.nanargmin(R[i,:])
    if(numpy.isnan(bestindex)):
        bestepsilons[i]=numpy.nan
    else:
        bestepsilons[i]=epsilon[bestindex]

This vectorized version works too, but only if no all-nan columns are involved:

ar = numpy.nanargmin(R, axis=1)
bestepsilons = epsilon[ar]

So ideally I would want this last version to work with all-nan columns as well.

Rightward answered 11/3, 2014 at 9:49 Comment(0)
R
3

Found a solution:

# Start with every entry set to nan
bestepsilons1 = numpy.full(R.shape[0], numpy.nan)
# nanmin is nan exactly where the whole slice is nan, i.e. where nanargmin would raise an error
# (numpy emits a RuntimeWarning for those slices)
d0 = numpy.nanmin(R, axis=1)
# Where the slice is not all-nan, get the index with nanargmin and then store the matching epsilon value
valid = ~numpy.isnan(d0)
bestepsilons1[valid] = epsilon[numpy.nanargmin(R[valid, :], axis=1)]

This is basically a workaround: nanargmin is only called on the slices where it will not raise an error, and the all-nan slices keep their initial nan, which is what we want there anyway.
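The same idea can be wrapped in a small helper. This is only a sketch under some assumptions: the function name nanargmin_lookup and the sample R and epsilon arrays are made up for illustration, and the all-nan test uses numpy.isnan(R).all(axis=1) instead of nanmin, which avoids the RuntimeWarning but is not what the code above does.

```python
import numpy as np

def nanargmin_lookup(R, epsilon):
    """Return epsilon at each row's nan-ignoring minimum, or nan for all-nan rows.

    Hypothetical helper name; R is 2-D, epsilon is 1-D with len == R.shape[1].
    """
    R = np.asarray(R, dtype=float)
    epsilon = np.asarray(epsilon, dtype=float)
    # True for rows that contain at least one non-nan value
    valid = ~np.isnan(R).all(axis=1)
    # All-nan rows keep this initial nan
    best = np.full(R.shape[0], np.nan)
    # nanargmin is only called on rows where it cannot raise
    best[valid] = epsilon[np.nanargmin(R[valid], axis=1)]
    return best

# Illustrative data: the middle row is all-nan
R = np.array([[3.0, 1.0, 2.0],
              [np.nan, np.nan, np.nan],
              [np.nan, 5.0, 4.0]])
epsilon = np.array([0.1, 0.2, 0.3])
bestepsilons = nanargmin_lookup(R, epsilon)
```

Here the first and last entries of bestepsilons come from epsilon, and the middle entry stays nan.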

Rightward answered 13/3, 2014 at 12:54 Comment(0)
I
5
>>> def _nanargmin(arr, axis):
...    try:
...       return np.nanargmin(arr, axis)
...    except ValueError:
...       return np.nan

Demo:

>>> a = np.array([[np.nan]*10, np.ones(10)])
>>> _nanargmin(a, axis=1)
nan
>>> _nanargmin(a, axis=0)
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

Anyway, this is unlikely to be what you want: a single all-nan slice makes the whole call return nan. I'm not sure what exactly you are after. If all you want is to filter out the nans, then use boolean indexing:

>>> a[~np.isnan(a)]
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
>>> np.argmin(_)
0

EDIT2: It looks like you're after masked arrays:

>>> a = np.vstack(([np.nan]*10, np.arange(10), np.arange(11, 1, -1)))
>>> a[2, 4] = np.nan
>>> m = np.ma.masked_array(a, np.isnan(a))
>>> np.argmin(m, axis=0)
array([1, 1, 1, 1, 1, 1, 2, 2, 2, 2])
>>> np.argmin(m, axis=1)
array([0, 0, 9])
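Note that the fully masked row above gets index 0, not nan. A possible way to carry this through to the epsilon lookup from the question is sketched below; the R and epsilon values are invented for illustration, and np.ma.masked_invalid is used here as a shorthand for masking the nans.

```python
import numpy as np

# Illustrative data: the middle row is all-nan
R = np.array([[3.0, 1.0, 2.0],
              [np.nan, np.nan, np.nan],
              [np.nan, 5.0, 4.0]])
epsilon = np.array([0.1, 0.2, 0.3])

# Mask every nan (and inf) entry
m = np.ma.masked_invalid(R)
# Masked entries are ignored; a fully masked row yields index 0
idx = m.argmin(axis=1)
# Rows where every entry is masked have no meaningful index
all_masked = np.ma.getmaskarray(m).all(axis=1)
# Look up epsilon, then overwrite the meaningless rows with nan
bestepsilons = np.where(all_masked, np.nan, epsilon[idx])
```

Indexing epsilon with idx always works because idx holds plain integers; the nan is only put in afterwards.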
Ifni answered 11/3, 2014 at 10:22 Comment(3)
Then I get "index must be either an int or a sequence" on the line where I do bestepsilons = epsilon[ar]. - Rightward
Sure, you can't index an array with a nan. What exactly are you trying to achieve? - Ifni
I am picking the best model fit at each point. If one of the parameters does not exist at a point, all simulated values at that point are nan as well. I then pick the best simulated value with argmin. The idea is to run this argmin over the entire image at once using 'axis', but that doesn't work if there are points with bad data. - Rightward
A
0

I had a similar problem with an array of shape (nz,ny,nx) in which some slices [:,j,i] are entirely filled with NaNs. In my case I needed the argmax index along axis=0, and if I do

np.nanargmax(array,axis=0)

I get "ValueError: All-NaN slice encountered"

Since I'm not interested in the argmax of slices that consist entirely of NaNs, as a workaround I filled the NaNs with zeros (this assumes the valid values are positive; otherwise fill with -np.inf instead):

mask = np.isnan(array)
array[mask] = 0
idx2d = np.argmax(array,axis=0)

This gives the indices of the maximum of array along axis=0. The idx2d array can then be masked again wherever the whole slice was NaN:

idx2d = np.ma.masked_where(mask.all(axis=0), idx2d)

If you need np.argmin instead, follow the same steps but fill the NaNs with a very large number (for example np.inf) instead of 0.
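The argmin variant could look roughly like this. It is a sketch, not a drop-in replacement: the function name argmin_ignoring_nan and the tiny sample array are invented for illustration, and np.where is used so that the input array is not modified in place.

```python
import numpy as np

def argmin_ignoring_nan(array, axis=0):
    """Argmin along `axis` that ignores NaNs and masks all-NaN slices.

    Hypothetical helper name, written for illustration.
    """
    mask = np.isnan(array)
    # Replace NaNs with +inf in a copy, so they can never be the minimum
    filled = np.where(mask, np.inf, array)
    idx = np.argmin(filled, axis=axis)
    # Hide the indices of slices that had no valid value at all
    return np.ma.masked_where(mask.all(axis=axis), idx)

# Shape (nz, ny, nx) = (2, 1, 2); the slice [:, 0, 0] is all-NaN
array = np.array([[[np.nan, 2.0]],
                  [[np.nan, 1.0]]])
idx2d = argmin_ignoring_nan(array, axis=0)
```

The result has shape (ny, nx), with the entry for the all-NaN slice masked.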

Applewhite answered 28/2 at 14:37 Comment(0)
