Find the first element in a sorted array that is greater than the target
Asked Answered
K

8

72

In a general binary search, we are looking for a value which appears in the array. Sometimes, however, we need to find the first element which is either greater or less than a target.

Here is my ugly, incomplete solution:

// Assume all elements are positive, i.e., greater than zero
int bs (int[] a, int t) {
  int s = 0, e = a.length;
  int firstlarge = 1 << 30;
  int firstlargeindex = -1;
  while (s < e) {
    int m = (s + e) / 2;
    if (a[m] > t) {
      // how can I know a[m] is the first larger than
      if(a[m] < firstlarge) {
        firstlarge = a[m];
        firstlargeindex = m;
      }
      e = m - 1; 
    } else if (a[m] < /* something */) {
      // go to the right part
      // how can i know is the first less than  
    }
  }
}

Is there a more elegant solution for this kind of problem?

Khalif answered 1/7, 2011 at 22:57 Comment(3)
Is the array sorted? If so then binary search, if not then linear search...Morphosis
it is a sorted array so that we can do binary search. In above code, I am using comparison to find the first greater or less element in the arrayKhalif
why not use upper_bound c++ STLDoublepark
D
106

One way of thinking about this problem is to think about doing a binary search over a transformed version of the array, where the array has been modified by applying the function

f(x) = 1 if x > target
       0 else

Now, the goal is to find the very first place that this function takes on the value 1. We can do that using a binary search as follows:

int low = 0, high = numElems; // numElems is the size of the array i.e arr.size() 
while (low != high) {
    int mid = (low + high) / 2; // Or a fancy way to avoid int overflow
    if (arr[mid] <= target) {
        /* This index, and everything below it, must not be the first element
         * greater than what we're looking for because this element is no greater
         * than the element.
         */
        low = mid + 1;
    }
    else {
        /* This element is at least as large as the element, so anything after it can't
         * be the first element that's at least as large.
         */
        high = mid;
    }
}
/* Now, low and high both point to the element in question. */

To see that this algorithm is correct, consider each comparison being made. If we find an element that's no greater than the target element, then it and everything below it can't possibly match, so there's no need to search that region. We can recursively search the right half. If we find an element that is larger than the element in question, then anything after it must also be larger, so they can't be the first element that's bigger and so we don't need to search them. The middle element is thus the last possible place it could be.

Note that on each iteration we drop off at least half the remaining elements from consideration. If the top branch executes, then the elements in the range [low, (low + high) / 2] are all discarded, causing us to lose floor((low + high) / 2) - low + 1 >= (low + high) / 2 - low = (high - low) / 2 elements.

If the bottom branch executes, then the elements in the range [(low + high) / 2 + 1, high] are all discarded. This loses us high - floor(low + high) / 2 + 1 >= high - (low + high) / 2 = (high - low) / 2 elements.

Consequently, we'll end up finding the first element greater than the target in O(lg n) iterations of this process.

Here's a trace of the algorithm running on the array 0 0 1 1 1 1.

Initially, we have

0 0 1 1 1 1
L = 0       H = 6

So we compute mid = (0 + 6) / 2 = 3, so we inspect the element at position 3, which has value 1. Since 1 > 0, we set high = mid = 3. We now have

0 0 1
L     H

We compute mid = (0 + 3) / 2 = 1, so we inspect element 1. Since this has value 0 <= 0, we set mid = low + 1 = 2. We're now left with L = 2 and H = 3:

0 0 1
    L H

Now, we compute mid = (2 + 3) / 2 = 2. The element at index 2 is 1, and since 10, we set H = mid = 2, at which point we stop, and indeed we're looking at the first element greater than 0.

Depoliti answered 1/7, 2011 at 23:9 Comment(17)
Can you please run an example of your solution, for example we have input 0,0,1,1,1,1. lets find the first element that is greater than 0.Khalif
similarly, we can compose the func to find the first element smaller than t: if(a[m]>=t) high = m-1; else low = m;Khalif
It is similar that "find the last element that is less than target"Khalif
@SecureFish: Just to add this minor point: For this opposite problem it is also necessary to adjust the calculation of mid. Due to the combination of the round-down-effect in the division and the subtracting, it is possible to get negative high values without a modification. This could be solved by changing to a round-up-behavior in this calculation, for instance by adding the modulus 2 term again.Agenesis
if we do a normal binary search and when array[mid]==target,also low=mid+1,then the low will always point to the first element greater than target,right?Freesia
What if there is no element in the array greater than the target?Protochordate
I don't understand your worked example. You give an array of six elements, so numElems=6, but you are initializing H=5.Protochordate
@EdAvis Oh wow, I had inclusive bounds in my trace and exclusive bounds in my code. Trace updated!Depoliti
The only thing I would replace in this response would be to use int mid=low+(high-low)/2; to avoid overflowing for large integers.Lubeck
You are missing an edge case where the input array only has 1 element. To handle that you have to compare the value array[start] with targetErotomania
@Erotomania I might be missing something, but doesn’t this code properly handle that case? It reports the first/only element in the case that it’s greater than the target and reports nothing otherwise.Depoliti
@Depoliti Is your code written in Java or other language? I don't see a return statement nor the function header, so I assume you are returning the low or high directly after your while loop. I think you need to have a check before or after the while loop. eg: if target is larger than the last element, there will be no answer, so you need to return something to indicate that. -1 or nullErotomania
@Erotomania When the loop finishes, low and high are equal and point right before the first element greater than the target (or, if one doesn’t exist, past the end of the array.)Depoliti
@Depoliti A few years late, but I think your algorithm assumes there is a valid solution. I don't think it accounts for the edge case where the last element in the array is <= the target. I believe your code would return the last element in the array in this case, but that wouldn't be correct?Abamp
Oh sorry. I just saw that you initialized high to numElems instead of the last index. Btw, for all binary search, is it better to initialize to numElems instead of the the last index (numElems-1)?Abamp
Because the entire is preprocessed before running Binary Search, the overall approach is O(n). What's more, you can find the first 1 during preprocessing.Brazee
@Brazee You’re right that you wouldn’t literally transform the array this way because it’s too inefficient. Rather, as we do the binary search, we pretend that we’re looking at an array of 0s and 1s where each element bigger than the target is 1 and each element less than the target is 0. This doesn’t require us to preprocess the whole array. Instead, we infer whether we’re looking at a 0 or 1 at each step of the binary search.Depoliti
C
14

After many years of teaching algorithms, my approach for solving binary search problems is to set the start and the end on the elements, not outside of the array. This way I can feel what's going on and everything is under control, without feeling magic about the solution.

The key point in solving binary search problems (and many other loop-based solutions) is a set of good invariants. Choosing the right invariant makes problem-solving a cake. It took me many years to grasp the invariant concept although I had learned it first in college many years ago.

Even if you want to solve binary search problems by choosing start or end outside of the array, you can still achieve it with a proper invariant. That being said, my choice is stated above to always set a start on the first element and end on the last element of the array.

So to summarize, so far we have:

int start = 0; 
int end = a.length - 1; 

Now the invariant. The array right now we have is [start, end]. We don't know anything yet about the elements. All of them might be greater than the target, or all might be smaller, or some smaller and some larger. So we can't make any assumptions so far about the elements. Our goal is to find the first element greater than the target. So we choose the invariants like this:

Any element to the right of the end is greater than the target.
Any element to the left of the start is smaller than or equal to the target.

We can easily see that our invariant is correct at the start (ie before going into any loop). All the elements to the left of the start (no elements basically) are smaller than or equal to the target, same reasoning for the end.

With this invariant, when the loop finishes, the first element after the end will be the answer (remember the invariant that the right side of the end are all greater than the target?). So answer = end + 1.

Also, we need to note that when the loop finishes, the start will be one more than the end. ie start = end + 1. So equivalently we can say start is the answer as well (invariant was that anything to the left of the start is smaller than or equal to the target, so start itself is the first element larger than the target).

So everything being said, here is the code.

public static int find(int a[], int target) {
    int st = 0; 
    int end = a.length - 1; 
    while(st <= end) {
        int mid = (st + end) / 2;   // or elegant way of st + (end - st) / 2; 
        if (a[mid] <= target) {
            st = mid + 1; 
        } else { // mid > target
            end = mid - 1; 
        }
    }
    return st; // or return end + 1
}

A few extra notes about this way of solving binary search problems:

This type of solution always shrinks the size of subarrays by at least 1. This is obvious in the code. The new start or end are either +1 or -1 in the mid. I like this approach better than including the mid in both or one side, and then reason later why the algo is correct. This way it's more tangible and more error-free.

The condition for the while loop is st <= end. Not st < end. That means the smallest size that enters the while loop is an array of size 1. And that totally aligns with what we expect. In other ways of solving binary search problems, sometimes the smallest size is an array of size 2 (if st < end), and honestly I find it much easier to always address all array sizes including size 1.

So hope this clarifies the solution for this problem and many other binary search problems. Treat this solution as a way to professionally understand and solve many more binary search problems without ever wobbling whether the algorithm works for edge cases or not.

Cothran answered 23/1, 2019 at 13:36 Comment(4)
Thank you for explaining it using invariants! Makes everything so much more clear.Preponderant
@Cothran How would you handle start and end if you wanted to find the first element that was equal to the target? You need a way to indicate that the element is not in the array, which is typically done with -1 (but doesn't have to be).Overwinter
@Overwinter define the invariant like this: anything to the left of start is smaller than the target. (so anything to the right of end is greater than or equal). When the loop finishes, end + 1 will be on the first element equal or greater than the target. After the loop you can check whether "end + 1" is equal to the target. If not return -1.Cothran
@Overwinter I have also this answer on binary search: https://mcmap.net/q/276035/-binary-search-bounds, which clarifies a bit more on predicates. Remember in binary search problems the goal is to find the border between 0s and 1s. If you can think of assigning 0s and 1s (in other words a good predicate to give 0 or 1 to each element) then solving the binary search problem would be very easy. In the problem you mentioned, everything smaller than target is a 0, anything larger than or equal to the target is a 1. So the goal is to find the first 1. [0 0 0 0 1 1 1 ]. Hope this clarifies the point.Cothran
A
13

You can use std::upper_bound if the array is sorted (assuming n is the size of array a[]):

int* p = std::upper_bound( a, a + n, x );
if( p == a + n )
     std::cout << "No element greater";
else
     std::cout << "The first element greater is " << *p
               << " at position " << p - a;
Acey answered 1/7, 2011 at 23:19 Comment(0)
D
2

How about the following recursive approach:

public static int minElementGreaterThanOrEqualToKey(int A[], int key,
    int imin, int imax) {

    // Return -1 if the maximum value is less than the minimum or if the key
    // is great than the maximum
    if (imax < imin || key > A[imax])
        return -1;

    // Return the first element of the array if that element is greater than
    // or equal to the key.
    if (key < A[imin])
        return imin;

    // When the minimum and maximum values become equal, we have located the element. 
    if (imax == imin)
        return imax;

    else {
        // calculate midpoint to cut set in half, avoiding integer overflow
        int imid = imin + ((imax - imin) / 2);

        // if key is in upper subset, then recursively search in that subset
        if (A[imid] < key)
            return minElementGreaterThanOrEqualToKey(A, key, imid + 1, imax);

        // if key is in lower subset, then recursively search in that subset
        else
            return minElementGreaterThanOrEqualToKey(A, key, imin, imid);
    }
}
Dramaturgy answered 6/3, 2013 at 11:37 Comment(0)
R
2

Hhere is a modified binary search code in JAVA with time complexity O(logn) that :

  • returns index of element to be searched if element is present
  • returns index of next greater element if searched element is not present in array
  • returns -1 if an element greater than the largest element of array is searched
public static int search(int arr[],int key) {
    int low=0,high=arr.length,mid=-1;
    boolean flag=false;
    
    while(low<high) {
        mid=(low+high)/2;
        if(arr[mid]==key) {
            flag=true;
            break;
        } else if(arr[mid]<key) {
            low=mid+1;
        } else {
            high=mid;
        }
    }
    if(flag) {
        return mid;
    }
    else {
        if(low>=arr.length)
            return -1;
        else
        return low;
        //high will give next smaller
    }
}

public static void main(String args[]) throws IOException {
    BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
    //int n=Integer.parseInt(br.readLine());
    int arr[]={12,15,54,221,712};
    int key=71;
    System.out.println(search(arr,key));
    br.close();
}
Reflation answered 29/6, 2016 at 6:28 Comment(0)
D
1

My implementation uses condition bottom <= top which is different from the answer by templatetypedef.

int FirstElementGreaterThan(int n, const vector<int>& values) {
  int B = 0, T = values.size() - 1, M = 0;
  while (B <= T) { // B strictly increases, T strictly decreases
    M = B + (T - B) / 2;
    if (values[M] <= n) { // all values at or before M are not the target
      B = M + 1;
    } else {
      T = M - 1;// search for other elements before M
    }
  }
  return T + 1;
}
Daub answered 27/10, 2015 at 0:9 Comment(0)
F
1
public static int search(int target, int[] arr) {
        if (arr == null || arr.length == 0)
            return -1;
        int lower = 0, higher = arr.length - 1, last = -1;
        while (lower <= higher) {
            int mid = lower + (higher - lower) / 2;
            if (target == arr[mid]) {
                last = mid;
                lower = mid + 1;
            } else if (target < arr[mid]) {
                higher = mid - 1;
            } else {
                lower = mid + 1;
       }
    }
    return (last > -1 && last < arr.length - 1) ? last + 1 : -1;
}

If we find target == arr[mid], then any previous element would be either less than or equal to the target. Hence, the lower boundary is set as lower=mid+1. Also, last is the last index of 'target'. Finally, we return last+1 - taking care of boundary conditions.

Fanchette answered 30/6, 2020 at 1:48 Comment(0)
D
0
  • kind =0 : exact match
  • kind=1 : just grater than x
  • kind=-1 : just smaller than x;
  • It returns -1 if no match is found.
#include <iostream>
#include <algorithm>

using namespace std;


int g(int arr[], int l , int r, int x, int kind){
    switch(kind){
    case 0: // for exact match
        if(arr[l] == x) return l;
        else if(arr[r] == x) return r;
        else return -1;
        break;
    case 1: // for just greater than x
        if(arr[l]>=x) return l;
        else if(arr[r]>=x) return r;
        else return -1;
        break;
    case -1: // for just smaller than x
        if(arr[r]<=x) return r;
        else if(arr[l] <= x) return l;
        else return -1;
        break;
    default:
        cout <<"please give "kind" as 0, -1, 1 only" << ednl;
    }
}

int f(int arr[], int n, int l, int r, int x, int kind){
    if(l==r) return l;
    if(l>r) return -1;
    int m = l+(r-l)/2;
    while(m>l){
        if(arr[m] == x) return m;
        if(arr[m] > x) r = m;
        if(arr[m] < x) l = m;
        m = l+(r-l)/2;
    }
    int pos = g(arr, l, r, x, kind);
    return pos;
}

int main()
{
    int arr[] = {1,2,3,5,8,14, 22, 44, 55};
    int n = sizeof(arr)/sizeof(arr[0]);
    sort(arr, arr+n);
    int tcs;
    cin >> tcs;
    while(tcs--){
        int l = 0, r = n-1, x = 88, kind = -1; // you can modify these values
        cin >> x;
        int pos = f(arr, n, l, r, x, kind);
        // kind =0: exact match, kind=1: just grater than x, kind=-1: just smaller than x;
        cout <<"position"<< pos << " Value ";
        if(pos >= 0) cout << arr[pos];
        cout << endl;
    }
    return 0;
}
Deck answered 28/1, 2018 at 6:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.