Why does python's bisect_left return a valid index even if the value does not exist?
Asked Answered
G

2

7

I want to find if a number exists in a sorted array. To be straight an array contains fibonaccci number from 1 to 63. Below is the fibonacci number generator and some of it's output.

stacksize = 10000  # default 128 stack
from functools import lru_cache

@lru_cache(stacksize)
def nthfibonacci(n):
    if n <= 1:
        return 1
    elif n == 2:
        return 1
    elif n > 2:
        return nthfibonacci(n - 2) + nthfibonacci(n - 1)

 output = [nthfibonacci(k) for k in range(1,63+1)]

 # truncated output: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610,987, 
           1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368,.....]

Now I want to find if number 7 exists or not so I used the following code using python bisection module:

from bisect import bisect_left
elem_index = bisect_left(a=output, x=7, lo=0, hi=len(arr) - 1) 
# output of elem_index is 5 ???? But it is expected to be len(output)+1, right?   
# as we know if element is not found it returns len(array)+1 

Again if I just write a simple binary search it gives me correct result as follows:

def binsearch(arr, key):
    # arr.sort()
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == key:
            return mid
        else:
            if arr[mid] < key:
                low = mid + 1
            else:
                high = mid - 1
    return -1

print(binsearch(arr, 7)) # it gives me -1 as expected

So what is happening?

Geopolitics answered 11/6, 2019 at 6:0 Comment(0)
S
1

The documentation on bisect_left explains the behaviour:

bisect_left(...)
    bisect_left(a, x[, lo[, hi]]) -> index

    Return the index where to insert item x in list a, assuming a is sorted.

In short, bisect_left (and bisect_right) tells you where an element is if it exists, or where to insert it if it doesn't.

Consider a contrived example. Let's search for a value in a sorted list when that value exists.

l = [1, 4, 5]
bisect.bisect_left(l, 4)
# 1

bisect_left returns 1 because l[1] is 4. Now, repeat the process, but with a value that does not exist in that list.

bisect.bisect_left(l, 3)
# 1

In this case, l[1] is where you would find 3 if it existed in that sorted list.


What does this mean for you? All you have do to is modify your function to query the element at the index returned by bisect_left,

def binary_search(items, key):
    idx = bisect_left(items, key)
    if items[idx] != key:
         return -1

    return idx
Ster answered 11/6, 2019 at 6:7 Comment(2)
Totally make sense now. So we can do if arr[bisect_result] != key: print("Not Found"). Right @cs95?Geopolitics
the last suggestion will throw an exception if key is greater than any value in itemsCarrelli
M
2

As already noted, bisect_left() just locates the insertion point for an element in a list to maintain sorted order. The documentation also shows how to transform bisect_* functions to actual lookups, so you could use:

def index(lst, elem):
    i = bisect_left(lst, elem)
    if i != len(lst) and lst[i] == elem:
        return i
    return -1
Marshland answered 30/11, 2021 at 20:58 Comment(0)
S
1

The documentation on bisect_left explains the behaviour:

bisect_left(...)
    bisect_left(a, x[, lo[, hi]]) -> index

    Return the index where to insert item x in list a, assuming a is sorted.

In short, bisect_left (and bisect_right) tells you where an element is if it exists, or where to insert it if it doesn't.

Consider a contrived example. Let's search for a value in a sorted list when that value exists.

l = [1, 4, 5]
bisect.bisect_left(l, 4)
# 1

bisect_left returns 1 because l[1] is 4. Now, repeat the process, but with a value that does not exist in that list.

bisect.bisect_left(l, 3)
# 1

In this case, l[1] is where you would find 3 if it existed in that sorted list.


What does this mean for you? All you have do to is modify your function to query the element at the index returned by bisect_left,

def binary_search(items, key):
    idx = bisect_left(items, key)
    if items[idx] != key:
         return -1

    return idx
Ster answered 11/6, 2019 at 6:7 Comment(2)
Totally make sense now. So we can do if arr[bisect_result] != key: print("Not Found"). Right @cs95?Geopolitics
the last suggestion will throw an exception if key is greater than any value in itemsCarrelli

© 2022 - 2024 — McMap. All rights reserved.