get index of the first block of at least n consecutive False values in boolean array

Asked 6/4, 2018 at 13:20 Answered 6/4, 2018 at 14:32

I have a numpy boolean array

w=np.array([True,False,True,True,False,False,False])

I would like to get the index of the first time there are at n_at_least false values. For instance here

`n_at_least`=1 -> desired_index=1

`n_at_least`=3 -> desired_index=4

I have tried

np.cumsum(~w)

which does increase every time a False value is encountered. However, when True is encountered the counter is not starting from 0 again so I only get the total count of False elements rather than the count of the last consecutive ones.

Wastebasket answered 6/4, 2018 at 13:20 Comment(3)

Why desired_index=4 when a_at_least is 3 ? – Hedonism 6/4, 2018 at 13:27

@liliscent I think n_at_least is the required number of time False occurs. And so the starting index of the sequence where False occurs 3 times in a row is 4 – Physicality 6/4, 2018 at 13:28

@Physicality Sounds right. I misunderstood to the last index of all False. – Hedonism 6/4, 2018 at 13:31

I think for this linear search operation a python implementation is ok. My suggestion looks like this:

def find_block(arr, n_at_least=1):
    current_index = 0
    current_count = 0
    for index, item in enumerate(arr):
         if item:
             current_count = 0
             current_index = index + 1
         else:
             current_count += 1
         if current_count == n_at_least:
             return current_index
    return None # Make sure this is outside for loop

Running this function yields the following outputs:

>>> import numpy
>>> w = numpy.array([True, False, True, True, False, False, False])
>>> find_block(w, n_at_least=1)
1
>>> find_block(w, n_at_least=3)
4
>>> find_block(w, n_at_least=4)
>>> # None

Alburga answered 6/4, 2018 at 14:32 Comment(0)

Here's a vectorized solution that finds the start, stop indices and hence lengths of islands of zeros and finally uses argmax to get the starting index of the first island satisfying the criteria of zeros count being >= n -

def first_occ_index(w, n):
    idx = np.flatnonzero(np.r_[True, w, True])
    lens = np.diff(idx) - 1
    return idx[(lens >= n).argmax()]

Sample run -

In [107]: w
Out[107]: array([ True, False,  True,  True, False, False, False])

In [108]: first_occ_index(w, n=1)
Out[108]: 1

In [109]: first_occ_index(w, n=3)
Out[109]: 4

Yorgen answered 6/4, 2018 at 14:0 Comment(0)

I think you're falling into the numpy trap of only wanting to use numpy functions. What's wrong with python? This solution is O(n)

def f(array, n_at_least):
    curr_found_false = 0
    curr_index = 0
    for index, elem in enumerate(array):
        if not elem:
            if curr_found_false == 0:
                curr_index = index
            curr_found_false += 1
            if curr_found_false == n_at_least:
                return curr_index
        else:
            curr_found_false = 0

Outputs

w=np.array([True,False,True,True,False,False,False])
f(w, 1)
# --> 1
f(w, 3)
# --> 4

Evade answered 6/4, 2018 at 13:27 Comment(3)

Good answer. But I would claim that most people are falling in to the list "trap" :). The fact numpy is efficient is because it's 3rd party. – Paugh 6/4, 2018 at 14:41

@Paugh I suppose that's true. But looking for a numpy solution isn't always worth your time for the benefits it brings. That being said I would use one if one is available. – Evade 6/4, 2018 at 14:54

Just FYI, it can still be done faster using numpy and convolution. See https://mcmap.net/q/1323378/-how-to-implement-this-array-algorithm-in-a-more-efficient-way – Cavendish 7/7, 2020 at 21:29

Here is an O(n) numpy solution:

>>> def first_consec(A, n):
...     A = np.r_[True, A, True]
...     switch, = np.where(A[:-1]!=A[1:])
...     runs = switch[1::2] - switch[::2]
...     idx = np.argmax(runs >= n)
...     if runs[idx] < n:
...         return None
...     return switch[2*idx]
... 
>>> first_consec(w, 4)
>>> first_consec(w, 3)
4
>>> first_consec(w, 2)
4
>>> first_consec(w, 1)
1

Vacation answered 6/4, 2018 at 14:2 Comment(0)

I think for this linear search operation a python implementation is ok. My suggestion looks like this:

def find_block(arr, n_at_least=1):
    current_index = 0
    current_count = 0
    for index, item in enumerate(arr):
         if item:
             current_count = 0
             current_index = index + 1
         else:
             current_count += 1
         if current_count == n_at_least:
             return current_index
    return None # Make sure this is outside for loop

Running this function yields the following outputs:

>>> import numpy
>>> w = numpy.array([True, False, True, True, False, False, False])
>>> find_block(w, n_at_least=1)
1
>>> find_block(w, n_at_least=3)
4
>>> find_block(w, n_at_least=4)
>>> # None

Alburga answered 6/4, 2018 at 14:32 Comment(0)

It should work this way:

def n_at_least(n):
    for i in range(len(w)):
         if not any(w[i:i+n]):
             return i

However I don't know if there is a better way...

Polaroid answered 6/4, 2018 at 13:25 Comment(1)

I like it although you don't need that break and this solution is O(n^2). Clean though. – Evade 6/4, 2018 at 13:36

This is one way using a generator expression with slicing:

w = np.array([True,False,True,True,False,False,False])

n = 2
val = False

res = next((i for i, j in enumerate(w[k:k+n] for k in range(len(w)-n+1)) \
            if np.all(j==val)), None)

# 4

Paugh answered 6/4, 2018 at 14:10 Comment(0)

Recommended topics

Hot tags