Fast counts of elements of numpy array by value thresholds in another array
Given a numpy array of threshold values, what is the most efficient way to produce an array of the counts of another array meeting these values?

Assume the threshold value array is small and sorted, and the array of values to be counted is large-ish and unsorted.

Example: for each element of valueLevels, count the elements of values greater than or equal to it:

import numpy as np

n = int(1e5) # size of example

# example levels: the sequence 0, 1, 2.5, 5, 7.5, 10, 25, 50, 75, 100, ..., 50000, 75000
valueLevels = np.concatenate(
    [np.array([0.]),
     np.concatenate([[x * 10**y for x in [1., 2.5, 5., 7.5]]
                     for y in range(5)])]
)

np.random.seed(123)
values = np.random.uniform(low=0, high=1e5, size=n)

So far I have tried the list comprehension approach.

  • np.array([sum(values>=x) for x in valueLevels]) was unacceptably slow
  • np.array([len(values[values>=x]) for x in valueLevels]) was an improvement
  • sorting values did speed up the comprehension (in the example, from ~7 ms to ~0.5 ms), but the cost of the sort (~8 ms) exceeded the savings for one-time use

The best I have right now is a comprehension using np.count_nonzero:

%%timeit 
np.array([np.count_nonzero(values>=x) for x in valueLevels])
# 1000 loops, best of 3: 1.26 ms per loop

which is acceptable for my purposes. Still, out of curiosity, what I would like to know is:

  • If list comprehension is the way to go, can it be sped up? Or,
  • Are other approaches faster? (I have a vague sense that this could be done by broadcasting the values array over the thresholds array, but I can't figure out how to get the dimensions right for np.broadcast_arrays().)
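(For reference on the shape question: a small sketch, with made-up toy data, of how np.broadcast_arrays lines the arrays up once a trailing axis is added to the thresholds; in practice the comparison broadcasts directly and no explicit call is needed.)

```python
import numpy as np

valueLevels = np.array([0., 10., 100.])          # small sorted thresholds
values = np.array([5., 50., 500., 5000.])        # values to be counted

# Adding a trailing axis to valueLevels makes the shapes compatible:
# (3, 1) against (4,) broadcasts both arrays to shape (3, 4).
lv, vals = np.broadcast_arrays(valueLevels[:, None], values)
print(lv.shape, vals.shape)   # (3, 4) (3, 4)

# The comparison itself broadcasts the same way, so the explicit
# broadcast_arrays call is unnecessary for the counting task:
counts = (values >= valueLevels[:, None]).sum(axis=1)
print(counts)                 # [4 3 2]
```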
Gertrude answered 19/8, 2016 at 13:58

The fastest I have so far is

%timeit np.count_nonzero(values >= np.atleast_2d(valueLevels).T, axis=1)
# 1000 loops, best of 3: 860 µs per loop

np.sum is slower:

%timeit np.sum(values >= np.atleast_2d(valueLevels).T, axis=1)
# 100 loops, best of 3: 2.5 ms per loop

@Divakar's version is even slower:

%timeit np.count_nonzero(values[:, None] >= valueLevels, axis=0)
# 100 loops, best of 3: 3.86 ms per loop

However, I would probably still use your list comprehension, which is not much slower and does not create a big 2D boolean array as an intermediate step:

%timeit np.array([np.count_nonzero(values>=x) for x in valueLevels])
# 1000 loops, best of 3: 987 µs per loop
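All of these formulations count the same thing; a quick consistency sketch (with fresh, made-up example data) showing that the comprehension and the 2-D broadcast agree:

```python
import numpy as np

rng = np.random.RandomState(0)
values = rng.uniform(low=0, high=1e5, size=10_000)
valueLevels = np.array([0., 25., 500., 7.5e4])

# one pass over values per threshold
loop = np.array([np.count_nonzero(values >= x) for x in valueLevels])

# one (len(valueLevels), n) boolean array, counted along axis 1
# (count_nonzero's axis argument needs NumPy >= 1.12)
broad = np.count_nonzero(values >= np.atleast_2d(valueLevels).T, axis=1)

assert np.array_equal(loop, broad)
print(loop)
```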
Atrip answered 19/8, 2016 at 14:12
It's good to know how to do the 2D approach, +1. Performance-wise, I'm not seeing a conclusive difference either. I'll accept this if nothing superior comes along. Thanks. – Gertrude
Yup, I added the new axis along the longer array by mistake, paying the penalty there. Good job with the count_nonzero! – Europeanize

Approach #1: using np.searchsorted

values.size - np.searchsorted(values, valueLevels, sorter=values.argsort())
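Why this works, sketched on tiny made-up data: with the default side='left', np.searchsorted returns, for each threshold, the number of elements strictly below it, so subtracting from values.size leaves the count of elements greater than or equal to it:

```python
import numpy as np

values = np.array([3., 1., 4., 1., 5., 9., 2., 6.])
valueLevels = np.array([2., 5.])

# sorter=values.argsort() lets searchsorted treat the unsorted array
# as if it were sorted; side='left' (the default) yields, for each
# threshold, the count of elements strictly less than it.
below = np.searchsorted(values, valueLevels, sorter=values.argsort())
counts = values.size - below
print(counts)   # [6 3]
```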

Approach #2: using NumPy broadcasting

(values[:, None] >= valueLevels).sum(0)
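On a tiny made-up example, the broadcast works like this: values[:, None] has shape (n, 1), comparing it with the (k,) thresholds broadcasts to an (n, k) boolean array, and summing over axis 0 counts the matches per threshold:

```python
import numpy as np

values = np.array([3., 1., 4., 1., 5.])
valueLevels = np.array([2., 4.])

# (5, 1) compared with (2,) broadcasts to a (5, 2) boolean array;
# summing along axis 0 counts the True entries for each threshold.
mask = values[:, None] >= valueLevels
counts = mask.sum(0)
print(mask.shape, counts)   # (5, 2) [3 2]
```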
Europeanize answered 19/8, 2016 at 14:07
Approach #1 needs a whole 8.06 ms per loop. – Atrip
This helps answer the question by establishing that the broadcasting approach is not necessarily faster. +1 – Gertrude
@TimFuchs And sorting a huge array didn't help either :) – Europeanize
@Gertrude If values were already sorted, we would avoid that sorter in np.searchsorted and the speedup would be huge then. – Europeanize
@Gertrude Also, broadcasting is faster, as done in @Tim's post using atleast_2d. – Europeanize