How do I get indices of N maximum values in a NumPy array?

Asked 2/8, 2011 at 10:29 Answered 9/11, 2022 at 16:26

817

NumPy provides a way to get the index of the maximum value of an array via np.argmax.

I would like something similar that returns the indices of the N maximum values.

For instance, if I have an array, [1, 3, 2, 4, 5], then nargmax(array, n=3) would return the indices [4, 3, 1] which correspond to the elements [5, 4, 3].

Dumah answered 2/8, 2011 at 10:29 Comment(4)

possible duplicate of python+numpy: efficient way to take the min/max n values and indices from a matrix – Exploratory 2/8, 2011 at 11:3

Your question is not really well defined. For example, what would you expect the indices to be for array([5, 1, 5, 5, 2, 3, 2, 4, 1, 5]), with n = 3? Which of the alternatives, like [0, 2, 3], [0, 2, 9], ..., would be the correct one? Please elaborate on your specific requirements. Thanks – Falter 2/8, 2011 at 17:2

@eat, I don't really care about which one is supposed to be returned in this specific case. Even if it seems logical to return the first one encountered, that's not a requirement for me. – Persaud 3/8, 2011 at 16:46

argsort might be a viable alternative if you do not care about the order of the returned indices. See my answer below. – Quod 13/5, 2016 at 14:24

1000

Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])

>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])

>>> top4 = a[ind]
>>> top4
array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need that too, sort them afterwards:

>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])

To get the top-k elements in sorted order in this way takes O(n + k log k) time.
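The two steps above (partition, then sort only the k candidates) can be wrapped in a small helper. This is a minimal sketch, not part of NumPy: the name top_k_indices is made up for illustration, and it assumes a 1-D array with 1 <= k <= a.size.

```python
import numpy as np

def top_k_indices(a, k):
    """Indices of the k largest values of a 1-D array, largest first.

    Hypothetical helper; assumes 1 <= k <= a.size.
    """
    a = np.asarray(a)
    # Partition so the last k positions hold the k largest values (in arbitrary order).
    part = np.argpartition(a, -k)[-k:]
    # Sort only those k candidates (ascending), then reverse for descending order.
    return part[np.argsort(a[part])][::-1]

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
idx = top_k_indices(a, 4)
print(a[idx])  # the values are 9, 9, 6, 4; which of the tied indices appear is not guaranteed
```

The O(n) partition does the heavy lifting and the sort only ever touches k elements, which is where the O(n + k log k) bound comes from.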

Hive answered 2/8, 2011 at 10:29 Comment(16)

Actually it has to be O(n lg k) time. Cannot imagine how O(n + k lg k) can be – Blizzard 25/11, 2014 at 14:12

@Blizzard argpartition runs in linear time, O(n), using the introselect algorithm. The subsequent sort only handles k elements, so that runs in O(k log k). – Hive 26/11, 2014 at 15:52

If anybody is wondering how exactly np.argpartition and its sister algorithm np.partition work there is a more detailed explanation in the linked question: #10338033 – Mews 30/3, 2016 at 14:49

@FredFoo: why did you use -4? Did you do that to start from the end? (k being positive or negative works the same for me; it only prints the smallest numbers first!) – Frick 19/11, 2016 at 10:27

Running: {import numpy as np a = [9, 4, 4, 3, 3, 9, 0, 4, 6, 0] ind = np.argpartition(a, -4)[-4:] a[ind]} now throws this error.

Traceback (most recent call last):   File "<stdin>", line 1, in <module> TypeError: only integer scalar arrays can be converted to a scalar index

– Syllabus 8/8, 2017 at 18:5

@Syllabus use a=np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0]) because normal python lists do not support indexing by lists, unlike np.array – Tourniquet 8/8, 2017 at 19:34

how to use this method to find the indices of top n values of each row of a matrix? – Cloninger 2/2, 2018 at 12:8

@Umangsinghal np.argpartition takes an optional axis argument. To find the indices of the top n values for each row: np.argpartition(a, -n, axis=1)[-n:] – Apophthegm 6/6, 2019 at 12:6

@Umangsinghal For the top n values of each row, making sure the results are sorted at the end is more complicated. n should be a negative range, and you need to properly flip the order at the end as well: np.argpartition(matrix, range(-n, 0), axis=1)[:, ::-1] For some reason it's taken me many months to get to a good answer on this. – Ethyne 20/8, 2019 at 7:51

Be careful with the sort trick, though, if you need a stable sort over the whole array: if the k-th largest value is equal to other values in the initial array, you have no guarantee that these values will appear in the same order in the final result, because some of these "equal values" might be included in the output and others not. – Shaver 1/9, 2019 at 13:38

@JoshuaM And still incorrect. Correct way would be np.argpartition(matrix, range(-n, 0), axis=1)[:, :-(n+1):-1] – Nature 10/8, 2020 at 18:32

@Apophthegm Don't you mean np.argpartition(a, -n, axis=1)[:, -n:]? – Parvati 9/2, 2021 at 15:37

If I want to get the smallest k and biggest k values of the array at the same time, is there any better way than simply applying this solution twice? – Alcaide 27/8, 2021 at 14:30

@Frick -4 vs 4 does indeed matter for np.argpartition(a, -4). In this example it happens to give the same result, while -2 and 2 give different results. – Burgeon 11/6, 2022 at 21:0

It's worth mentioning that the kth argument of np.argpartition() refers to the kth element of the resulted array, not the primary array. That's why when we want the 4 largest elements, we give -4 to np.argpartition() or np.partition(). It causes the last 4 elements of the resulted array to be the 4 largest elements. More: https://mcmap.net/q/55122/-how-does-np-partition-interpret-the-argument-kth/4526384 – Keil 3/9, 2022 at 12:13

could it be O(nk *log(k))? It seems to me that it cannot be smaller than O(nk) that's the number of outputs – Kilbride 4/10, 2022 at 13:10

514

The simplest I've been able to come up with is:

>>> import numpy as np
>>> arr = np.array([1, 3, 2, 4, 5])
>>> arr.argsort()[-3:][::-1]
array([4, 3, 1])

This involves a complete sort of the array. I wonder if numpy provides a built-in way to do a partial sort; so far I haven't been able to find one.

If this solution turns out to be too slow (especially for small n), it may be worth looking at coding something up in Cython.

Restoration answered 2/8, 2011 at 10:32 Comment(10)

Could line 3 be written equivalently as arr.argsort()[-1:-4:-1]? I've tried it in interpreter and it comes up with the same result, but I'm wondering if it's not broken by some example. – Gillies 20/9, 2012 at 9:5

@Gillies Yes that should be equivalent for any list or array. Alternatively, this could be done without the reversal by using np.argsort(-arr)[:3], which I find more readable and to the point. – Lizbeth 29/5, 2013 at 19:48

what does [::-1] mean? @Restoration – Workingwoman 17/10, 2016 at 5:29

@Workingwoman it means reverse an array (literally, takes a copy of an array from unconstrained min to unconstrained max in a reversed order) – Shit 19/10, 2016 at 13:51

@Shit so the two : are not related to dimensions, right? The whole expression is used to create a reversed array? Am I understanding correctly? – Workingwoman 19/10, 2016 at 19:5

arr.argsort()[::-1][:n] is better because it returns empty for n=0 instead of the full array – Hypotrachelium 8/9, 2017 at 1:34

@Restoration numpy has the function argpartition which will isolate the top K elements from the rest without doing a full sort, and then the sorting can be done only on those K. – Sileas 12/7, 2019 at 12:44

It would be better if this was written in a general way and not specific to his array, which makes it unintelligible. – Embryotomy 15/7, 2019 at 21:57

@Lizbeth Index reversal is much faster than inversion, though – Parvati 26/1, 2020 at 3:41

@Workingwoman arr.argsort()[-3:] >>>array([1, 3, 4], dtype=int64) arr.argsort()[-3:][::-1] >>>array([4, 3, 1], dtype=int64) – Occultism 25/2, 2022 at 21:50

Simpler yet:

idx = (-arr).argsort()[:n]

where n is the number of maximum values.
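One difference between the negation and the reversal variants is how ties are ordered. A small sketch, assuming a stable sort is requested explicitly via kind="stable" (with the default sort kind the tie order is not guaranteed):

```python
import numpy as np

arr = np.array([5, 1, 5, 3])
n = 2

# Negate, then stable ascending sort: ties keep first-occurrence order.
by_negation = (-arr).argsort(kind="stable")[:n]

# Stable ascending sort, then reverse: ties come out in last-occurrence order.
by_reversal = arr.argsort(kind="stable")[::-1][:n]

print(by_negation)  # [0 2]
print(by_reversal)  # [2 0]
```

Both select the same set of indices here; only the order of the tied entries differs.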

Dori answered 11/12, 2014 at 22:13 Comment(4)

Can this be done for a 2d array? If not, do you perhaps know how? – Cheng 23/9, 2015 at 2:17

@AndrewHundt : simply use (-arr).argsort(axis=-1)[:, :n] – Sarge 29/12, 2018 at 3:44

similar would be arr[arr.argsort()[-n:]] instead of negating the array, just take a slice of the last n elements – Vulgarity 12/3, 2019 at 3:50

ind = np.argsort(-arr,axis=0)[:4] worked for me to find the first 4 indices column-wise – Brenn 20/5, 2021 at 16:54

Use:

>>> import heapq
>>> import numpy
>>> a = numpy.array([1, 3, 2, 4, 5])
>>> heapq.nlargest(3, range(len(a)), a.take)
[4, 3, 1]

For regular Python lists:

>>> a = [1, 3, 2, 4, 5]
>>> heapq.nlargest(3, range(len(a)), a.__getitem__)
[4, 3, 1]

If you use Python 2, use xrange instead of range.

Source: heapq — Heap queue algorithm

Bulger answered 9/9, 2013 at 5:30 Comment(2)

There's no need of a loop at all here: heapq.nlargest(3, xrange(len(a)), a.take). For Python lists we can use .__getitem__ instead of .take. – Addington 28/10, 2014 at 9:9

For n-dimensional arrays A in general: heapq.nlargest(3, range(len(A.ravel())), A.ravel().take). (I hope this only operates on views; see also ravel vs flatten, https://mcmap.net/q/22670/-what-is-the-difference-between-flatten-and-ravel-functions-in-numpy ) – Marconigraph 10/11, 2017 at 17:57

If you happen to be working with a multidimensional array then you'll need to flatten and unravel the indices:

def largest_indices(ary, n):
    """Returns the indices of the n largest values in a numpy array."""
    flat = ary.flatten()
    indices = np.argpartition(flat, -n)[-n:]
    indices = indices[np.argsort(-flat[indices])]
    return np.unravel_index(indices, ary.shape)

For example:

>>> xs = np.sin(np.arange(9)).reshape((3, 3))
>>> xs
array([[ 0.        ,  0.84147098,  0.90929743],
       [ 0.14112001, -0.7568025 , -0.95892427],
       [-0.2794155 ,  0.6569866 ,  0.98935825]])
>>> largest_indices(xs, 3)
(array([2, 0, 0]), array([2, 2, 1]))
>>> xs[largest_indices(xs, 3)]
array([ 0.98935825,  0.90929743,  0.84147098])

Rhapsodic answered 10/8, 2016 at 21:42 Comment(0)

Three Answers Compared For Coding Ease And Speed

Speed was important for my needs, so I tested three answers to this question.

Code from those three answers was modified as needed for my specific case.

I then compared the speed of each method.

Coding wise:

NPE's answer was the next most elegant and adequately fast for my needs.
Fred Foo's answer required the most refactoring for my needs but was the fastest. I went with this answer because, even though it took more work, it was not too bad and had significant speed advantages.
off99555's answer was the most elegant, but it is the slowest.

Complete Code for Test and Comparisons

import numpy as np
import time
import random
import sys
from operator import itemgetter
from heapq import nlargest

''' Fake Data Setup '''
a1 = list(range(1000000))
random.shuffle(a1)
a1 = np.array(a1)

''' ################################################ '''
''' NPE's Answer Modified A Bit For My Case '''
t0 = time.time()
indices = np.flip(np.argsort(a1))[:5]
results = []
for index in indices:
    results.append((index, a1[index]))
t1 = time.time()
print("NPE's Answer:")
print(results)
print(t1 - t0)
print()

''' Fred Foo's Answer Modified A Bit For My Case '''
t0 = time.time()
indices = np.argpartition(a1, -5)[-5:]
results = []
for index in indices:
    results.append((a1[index], index))
results.sort(reverse=True)
results = [(b, a) for a, b in results]
t1 = time.time()
print("Fred Foo's Answer:")
print(results)
print(t1 - t0)
print()

''' off99555's Answer - No Modification Needed For My Needs '''
t0 = time.time()
result = nlargest(5, enumerate(a1), itemgetter(1))
t1 = time.time()
print("off99555's Answer:")
print(result)
print(t1 - t0)

Output with Speed Reports

NPE's Answer:

[(631934, 999999), (788104, 999998), (413003, 999997), (536514, 999996), (81029, 999995)]
0.1349949836730957

Fred Foo's Answer:

[(631934, 999999), (788104, 999998), (413003, 999997), (536514, 999996), (81029, 999995)]
0.011161565780639648

off99555's Answer:

[(631934, 999999), (788104, 999998), (413003, 999997), (536514, 999996), (81029, 999995)]
0.439760684967041
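Single time.time() measurements like the ones above can be noisy. As a rough sketch of a more repeatable comparison, the standard timeit module can run each method several times and report the best run. The function names, the array size, and the repeat counts below are arbitrary choices for illustration, not the setup used for the numbers above.

```python
import timeit
from heapq import nlargest
from operator import itemgetter

import numpy as np

rng = np.random.default_rng(0)
data = rng.permutation(100_000)   # distinct values, so the top 5 is unambiguous
k = 5

def with_argsort():
    # Full sort, then take the last k indices in descending order.
    return np.argsort(data)[::-1][:k]

def with_argpartition():
    # Partial partition, then sort only the k candidates.
    part = np.argpartition(data, -k)[-k:]
    return part[np.argsort(data[part])][::-1]

def with_heapq():
    # Pure-Python heap over (index, value) pairs.
    return [i for i, _ in nlargest(k, enumerate(data), key=itemgetter(1))]

for fn in (with_argsort, with_argpartition, with_heapq):
    # Best of 3 repeats of 3 calls each, reported per call.
    best = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{fn.__name__}: {best:.6f} s per call")
```

Absolute timings depend on the machine and the NumPy version, so treat any output as relative only.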

Ungula answered 7/11, 2020 at 5:35 Comment(0)

If you don't care about the order of the K largest elements, you can use argpartition, which should perform better than a full sort through argsort.

K = 4 # We want the indices of the four largest values
a = np.array([0, 8, 0, 4, 5, 8, 8, 0, 4, 2])
np.argpartition(a,-K)[-K:]
array([4, 1, 5, 6])

Credits go to this question.

I ran a few tests and it looks like argpartition outperforms argsort as the size of the array and the value of K increase.

Quod answered 13/5, 2016 at 13:16 Comment(1)

If you do care about the order, just argpartition then sort :) – Leet 27/11, 2023 at 9:17

For multidimensional arrays you can use the axis keyword in order to apply the partitioning along the expected axis.

# For a 2D array
indices = np.argpartition(arr, -N, axis=1)[:, -N:]

And for grabbing the items:

x = arr.shape[0]
arr[np.repeat(np.arange(x), N), indices.ravel()].reshape(x, N)

But note that this won't return a sorted result. In that case you can use np.argsort() along the intended axis:

indices = np.argsort(arr, axis=1)[:, -N:]

# Result
x = arr.shape[0]
arr[np.repeat(np.arange(x), N), indices.ravel()].reshape(x, N)

Here is an example:

In [42]: a = np.random.randint(0, 20, (10, 10))

In [44]: a
Out[44]:
array([[ 7, 11, 12,  0,  2,  3,  4, 10,  6, 10],
       [16, 16,  4,  3, 18,  5, 10,  4, 14,  9],
       [ 2,  9, 15, 12, 18,  3, 13, 11,  5, 10],
       [14,  0,  9, 11,  1,  4,  9, 19, 18, 12],
       [ 0, 10,  5, 15,  9, 18,  5,  2, 16, 19],
       [14, 19,  3, 11, 13, 11, 13, 11,  1, 14],
       [ 7, 15, 18,  6,  5, 13,  1,  7,  9, 19],
       [11, 17, 11, 16, 14,  3, 16,  1, 12, 19],
       [ 2,  4, 14,  8,  6,  9, 14,  9,  1,  5],
       [ 1, 10, 15,  0,  1,  9, 18,  2,  2, 12]])

In [45]: np.argpartition(a, np.argmin(a, axis=0))[:, 1:] # 1 is because the first item is the minimum one.
Out[45]:
array([[4, 5, 6, 8, 0, 7, 9, 1, 2],
       [2, 7, 5, 9, 6, 8, 1, 0, 4],
       [5, 8, 1, 9, 7, 3, 6, 2, 4],
       [4, 5, 2, 6, 3, 9, 0, 8, 7],
       [7, 2, 6, 4, 1, 3, 8, 5, 9],
       [2, 3, 5, 7, 6, 4, 0, 9, 1],
       [4, 3, 0, 7, 8, 5, 1, 2, 9],
       [5, 2, 0, 8, 4, 6, 3, 1, 9],
       [0, 1, 9, 4, 3, 7, 5, 2, 6],
       [0, 4, 7, 8, 5, 1, 9, 2, 6]])

In [46]: np.argpartition(a, np.argmin(a, axis=0))[:, -3:]
Out[46]:
array([[9, 1, 2],
       [1, 0, 4],
       [6, 2, 4],
       [0, 8, 7],
       [8, 5, 9],
       [0, 9, 1],
       [1, 2, 9],
       [3, 1, 9],
       [5, 2, 6],
       [9, 2, 6]])

In [89]: a[np.repeat(np.arange(x), 3), ind.ravel()].reshape(x, 3)
Out[89]:
array([[10, 11, 12],
       [16, 16, 18],
       [13, 15, 18],
       [14, 18, 19],
       [16, 18, 19],
       [14, 14, 19],
       [15, 18, 19],
       [16, 17, 19],
       [ 9, 14, 14],
       [12, 15, 18]])
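The repeat/ravel/reshape indexing used above can likely be shortened with np.take_along_axis (available in NumPy 1.15 and later), as one of the comments suggests. A minimal sketch for the 2D, per-row case; the helper name top_n_per_row is hypothetical:

```python
import numpy as np

def top_n_per_row(arr, n):
    """Return (indices, values) of the n largest entries in each row, largest first."""
    # Unordered indices of the n largest entries of every row.
    part = np.argpartition(arr, -n, axis=1)[:, -n:]
    part_vals = np.take_along_axis(arr, part, axis=1)
    # Order just those n candidates per row, descending.
    order = np.argsort(part_vals, axis=1)[:, ::-1]
    idx = np.take_along_axis(part, order, axis=1)
    return idx, np.take_along_axis(arr, idx, axis=1)

a = np.array([[7, 11, 12, 0, 2],
              [16, 4, 3, 18, 5],
              [2, 9, 15, 12, 18]])
idx, vals = top_n_per_row(a, 3)
print(idx)   # rows: [2 1 0], [3 0 4], [4 2 3]
print(vals)  # rows: [12 11 7], [18 16 5], [18 15 12]
```

This keeps the cheap partition for selection and sorts only n entries per row.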

Granese answered 11/12, 2016 at 14:34 Comment(2)

I think you can simplify the indexing here by using np.take_along_axis (which likely did not exist when you answered this question) – Tropophilous 19/12, 2019 at 11:33

The default axis parameter for np.argpartition is -1 so no need set it to 1 in your 2D array case. – Claytor 27/1, 2022 at 18:42

np.argpartition returns the indices of the k largest values after only a partial sort, and is faster than np.argsort (which performs a full sort) when the array is quite large. But the returned indices are NOT in ascending/descending order. Let's see this with an example:

We can see that if you want a strict ascending order top k indices, np.argpartition won't return what you want.

Apart from doing a sort manually after np.argpartition, my solution is to use PyTorch, torch.topk, a tool for neural network construction, providing NumPy-like APIs with both CPU and GPU support. It's as fast as NumPy with MKL, and offers a GPU boost if you need large matrix/vector calculations.

The code for strictly ascending/descending top-k indices is:

Note that torch.topk accepts a torch tensor and returns both the top k values and the top k indices as torch.Tensor objects. As in NumPy, you can pick the axis to work along (torch.topk calls this argument dim), so you can handle multi-dimensional arrays/tensors.

Chaeta answered 25/1, 2018 at 5:0 Comment(1)

Code snippets are replicate when you share screenshots. Code blocks will be much appreciated. – Claytor 27/1, 2022 at 18:40

This can be faster than a full sort, depending on the size of your original array and the size of your selection:

>>> A = np.random.randint(0,10,10)
>>> A
array([5, 1, 5, 5, 2, 3, 2, 4, 1, 0])
>>> B = np.zeros(3, int)
>>> sentinel = A.min() - 1  # strictly smaller than every element of A
>>> for i in range(3):
...     idx = np.argmax(A)
...     B[i] = idx; A[idx] = sentinel
...
>>> B
array([0, 2, 3])

This, of course, modifies your original array. If that matters, you can work on a copy or restore the original values afterwards, whichever is cheaper for your use case.

Sieber answered 2/8, 2011 at 13:54 Comment(3)

FWIW, your solution won't provide an unambiguous solution in all situations. OP should describe how to handle these ambiguous cases. Thanks – Falter 2/8, 2011 at 17:9

@Falter The OP's question is a little ambiguous. An implementation, however, is not really open to interpretation. :) The OP should simply refer to the definition of np.argmax docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html to be sure this specific solution meets the requirements. It's possible that any solution meeting the OP's stated requirement is acceptable. – Sieber 2/8, 2011 at 18:5

Well, one might consider the implementation of argmax(.) to be unambiguous as well. (IMHO it tries to follow some kind of short circuiting logic, but unfortunately fails to provide universally acceptable behavior). Thanks – Falter 2/8, 2011 at 18:50

Use:

from operator import itemgetter
from heapq import nlargest
result = nlargest(N, enumerate(your_list), itemgetter(1))

The result list now contains N (index, value) tuples for the N largest values, ordered from largest to smallest.

Verdict answered 17/4, 2016 at 10:6 Comment(0)

Use:

def max_indices(arr, k):
    '''
    Returns the indices of the k first largest elements of arr
    (in descending order in values)
    '''
    assert k <= arr.size, 'k should be smaller or equal to the array size'
    arr_ = arr.astype(float)  # make a copy of arr
    max_idxs = []
    for _ in range(k):
        max_element = np.max(arr_)
        if np.isinf(max_element):
            break
        else:
            idx = np.where(arr_ == max_element)
        max_idxs.append(idx)
        arr_[idx] = -np.inf
    return max_idxs

It also works with 2D arrays. For example,

In [0]: A = np.array([[ 0.51845014,  0.72528114],
                     [ 0.88421561,  0.18798661],
                     [ 0.89832036,  0.19448609],
                     [ 0.89832036,  0.19448609]])
In [1]: max_indices(A, 8)
Out[1]:
    [(array([2, 3], dtype=int64), array([0, 0], dtype=int64)),
     (array([1], dtype=int64), array([0], dtype=int64)),
     (array([0], dtype=int64), array([1], dtype=int64)),
     (array([0], dtype=int64), array([0], dtype=int64)),
     (array([2, 3], dtype=int64), array([1, 1], dtype=int64)),
     (array([1], dtype=int64), array([1], dtype=int64))]

In [2]: A[max_indices(A, 8)[0]][0]
Out[2]: array([ 0.89832036])

Binary answered 30/1, 2018 at 14:15 Comment(2)

Works good, but gives more results if you have duplicate (maximum) values in your array A. I would expect exactly k results but in case of duplicate values, you get more than k results. – Kentigera 21/2, 2018 at 12:53

I slightly modified the code. The list of indices that is returned has length equal exactly to k. If you have duplicates, they are grouped into a single tuple. – Binary 21/2, 2018 at 16:4

A vectorized 2D implementation using argpartition:

k = 3
probas = np.array([
    [.6, .1, .15, .15],
    [.1, .6, .15, .15],
    [.3, .1, .6, 0],
])

k_indices = np.argpartition(-probas, k-1, axis=-1)[:, :k]

# adjust indices to apply in flat array
adjuster = np.arange(probas.shape[0]) * probas.shape[1]
adjuster = np.broadcast_to(adjuster[:, None], k_indices.shape)
k_indices_flat = k_indices + adjuster

k_values = probas.flatten()[k_indices_flat]

# k_indices:
# array([[0, 2, 3],
#        [1, 2, 3],
#        [2, 0, 1]])
# k_values:
# array([[0.6 , 0.15, 0.15],
#        [0.6 , 0.15, 0.15],
#        [0.6 , 0.3 , 0.1 ]])

Claytor answered 27/1, 2022 at 19:48 Comment(0)

I found it most intuitive to use np.unique.

The idea is that np.unique, called with return_inverse=True, also returns for each input element the index of its value in the sorted array of unique values. Comparing those indices with the index of the largest unique value recovers every position of the maximum in the original array.

multi_max = [1,1,2,2,4,0,0,4]
uniques, idx = np.unique(multi_max, return_inverse=True)
print(np.squeeze(np.argwhere(idx == np.argmax(uniques))))
>> [4 7]
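If only the positions of the single largest value are needed, the same result can probably be had more directly by comparing against the maximum. A short sketch using the same input:

```python
import numpy as np

multi_max = np.array([1, 1, 2, 2, 4, 0, 0, 4])
# Build a boolean mask of entries equal to the maximum, then list the True positions.
positions = np.flatnonzero(multi_max == multi_max.max())
print(positions)  # [4 7]
```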

Homing answered 12/1, 2018 at 14:39 Comment(1)

I used this answer to get the top n distinct values, using the sorted idx to index uniques: top5distinct = uniques[np.argsort(idx)[::-1][:5]]. – Right 20/12, 2022 at 10:18

The following is a very easy way to see the maximum elements and their positions. Here axis selects the direction: for a 2D array, axis=0 gives the maximum of each column and axis=1 gives the maximum of each row. For higher dimensions, pick the axis you need.

M = np.random.random((3, 4))
print(M)
print(M.max(axis=1), M.argmax(axis=1))

Pentathlon answered 16/6, 2018 at 8:20 Comment(1)

I used this link jakevdp.github.io/PythonDataScienceHandbook/… – Pentathlon 16/6, 2018 at 8:22

Here's a more complicated way that increases n if the nth value has ties:

>>> def get_top_n_plus_ties(arr, n):
...     sorted_args = np.argsort(-arr)
...     thresh = arr[sorted_args[n - 1]]  # the n-th largest value
...     n_ = np.sum(arr >= thresh)        # count everything tied with or above it
...     return sorted_args[:n_]
...
>>> get_top_n_plus_ties(np.array([2,9,8,3,0,2,8,3,1,9,5]),3)
array([1, 9, 2, 6])

Falsework answered 19/11, 2020 at 20:57 Comment(0)

I think the most time-efficient way is to iterate through the array manually while maintaining a min-heap of size k, as other people have mentioned.

I also came up with a brute-force approach:

# my_array must have a float dtype: -inf cannot be stored in an integer array
top_k_index_list = [ ]
for i in range(k):
    top_k_index_list.append(np.argmax(my_array))
    my_array[top_k_index_list[-1]] = -float('inf')

After argmax returns the index of the largest element, that element is set to negative infinity, so the next call to argmax returns the index of the second largest element, and so on. You can record the original values of these elements and restore them afterwards if you want.
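The heap idea mentioned at the start of this answer could look roughly like the following. It is a plain-Python sketch using the standard heapq module; the function name top_k_with_heap is made up for illustration, and it leaves the input untouched.

```python
import heapq

def top_k_with_heap(values, k):
    """Indices of the k largest items, largest first, using a heap of size k."""
    if k <= 0:
        return []
    heap = []  # holds (value, index) pairs; heap[0] is the smallest one kept so far
    for i, v in enumerate(values):
        if len(heap) < k:
            heapq.heappush(heap, (v, i))
        elif v > heap[0][0]:
            # The new value beats the smallest of the current top k: swap it in.
            heapq.heapreplace(heap, (v, i))
    # Sort the k survivors by value, largest first, and keep only the indices.
    return [i for v, i in sorted(heap, reverse=True)]

print(top_k_with_heap([1, 3, 2, 4, 5], 3))  # [4, 3, 1]
```

Each element costs at most one O(log k) heap operation, so the whole pass is O(n log k), though for large NumPy arrays the Python-level loop is likely slower in practice than np.argpartition.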

Carcinoma answered 25/4, 2018 at 10:9 Comment(1)

TypeError: 'float' object cannot be interpreted as an integer – Occultism 25/2, 2022 at 21:54

This code works for a 2D NumPy array:

mat = np.array([[1, 3], [2, 5]]) # 2D NumPy array

n = 2  # number of largest values to select
n_largest_mat = np.sort(mat, axis=None)[-n:] # the n largest values
tf_n_largest = np.zeros(mat.shape, dtype=bool) # all-False mask with the same shape as mat
for x in n_largest_mat:
  tf_n_largest = (tf_n_largest) | (mat == x) # mark every position holding one of those values

n_largest_elems = mat[tf_n_largest] # boolean-mask indexing

This builds a boolean mask of the n largest entries, which can also be used to extract those elements from the array.

Donetsk answered 23/10, 2019 at 4:28 Comment(0)

When top_k << axis_length, this is faster than a full argsort.

import numpy as np

def get_sorted_top_k(array, top_k=1, axis=-1, reverse=False):
    if reverse:
        axis_length = array.shape[axis]
        partition_index = np.take(np.argpartition(array, kth=-top_k, axis=axis),
                                  range(axis_length - top_k, axis_length), axis)
    else:
        partition_index = np.take(np.argpartition(array, kth=top_k, axis=axis), range(0, top_k), axis)
    top_scores = np.take_along_axis(array, partition_index, axis)
    # resort partition
    sorted_index = np.argsort(top_scores, axis=axis)
    if reverse:
        sorted_index = np.flip(sorted_index, axis=axis)
    top_sorted_scores = np.take_along_axis(top_scores, sorted_index, axis)
    top_sorted_indexes = np.take_along_axis(partition_index, sorted_index, axis)
    return top_sorted_scores, top_sorted_indexes

if __name__ == "__main__":
    import time
    from sklearn.metrics.pairwise import cosine_similarity

    x = np.random.rand(10, 128)
    y = np.random.rand(1000000, 128)
    z = cosine_similarity(x, y)
    start_time = time.time()
    sorted_index_1 = get_sorted_top_k(z, top_k=3, axis=1, reverse=True)[1]
    print(time.time() - start_time)

Higa answered 13/1, 2021 at 9:5 Comment(0)

You can simply use a dictionary to find top k values & indices in a numpy array. For example, if you want to find top 2 maximum values & indices

import numpy as np
nums = np.array([0.2, 0.3, 0.25, 0.15, 0.1])


def TopK(x, k):
    a = dict([(i, j) for i, j in enumerate(x)])
    sorted_a = dict(sorted(a.items(), key = lambda kv:kv[1], reverse=True))
    indices = list(sorted_a.keys())[:k]
    values = list(sorted_a.values())[:k]
    return (indices, values)

print(f"Indices: {TopK(nums, k = 2)[0]}")
print(f"Values: {TopK(nums, k = 2)[1]}")


Indices: [1, 2]
Values: [0.3, 0.25]

Plush answered 25/8, 2021 at 19:15 Comment(0)

If you are dealing with NaNs and/or have problems understanding np.argpartition, try pandas.DataFrame.sort_values.

import numpy as np
import pandas as pd    

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])

df = pd.DataFrame(a, columns=['array'])
max_values = df['array'].sort_values(ascending=False, na_position='last')
ind = max_values[0:3].index.to_list()

This example gives the indices of the 3 largest non-NaN values. It is probably inefficient, but easy to read and customize.
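For comparison, a NumPy-only way to skip NaNs might look like the sketch below. NumPy's sort order places NaN after every number, so a plain argsort()[-k:] would pick NaN entries as the "largest"; here they are filtered out first. The helper name top_k_ignoring_nan is hypothetical and a 1-D float array is assumed.

```python
import numpy as np

def top_k_ignoring_nan(a, k):
    """Indices of the k largest non-NaN values of a 1-D array, largest first."""
    a = np.asarray(a, dtype=float)
    valid = np.flatnonzero(~np.isnan(a))       # positions that hold real numbers
    k = min(k, valid.size)                     # there may be fewer than k valid values
    if k == 0:
        return np.array([], dtype=int)
    vals = a[valid]
    part = np.argpartition(vals, -k)[-k:]      # unordered top k among the valid values
    part = part[np.argsort(vals[part])][::-1]  # order them, largest first
    return valid[part]                         # map back to positions in the original array

a = np.array([9, np.nan, 4, 3, np.nan, 6])
print(top_k_ignoring_nan(a, 3))  # [0 5 2]
```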

Tintometer answered 9/11, 2022 at 16:26 Comment(0)
