When to use "=" in binary search condition?

Asked 24/2, 2016 at 21:32 Answered 13/6, 2022 at 22:38

I am quite confused about the scenarios when to use the = in binary search. For example, this is what i found from wiki, in which it is using while (imin <= imax)

int binary_search(int A[], int key, int imin, int imax)
{
  // continue searching while [imin,imax] is not empty
  while (imin <= imax)
    {
     int imid = midpoint(imin, imax);
  if (A[imid] == key) 
    return imid;
  else if (A[imid] < key)
     imin = imid + 1;
  else        
    imax = imid - 1; 
    }

  return KEY_NOT_FOUND;
}

However, I also found a lot of code using something like

while (imin < imax)

My questions are: what's the concern to use the = or not? Any reason behind?

Thanks so much!

Agony answered 24/2, 2016 at 21:32 Comment(2)

well, consider the difference between (x < 1) and (x <= 1).. consider which values will cause those two to go true, and which will cause them to go false... – Krutz 24/2, 2016 at 21:36

This answer is better. – Eisenhower 31/8, 2021 at 16:2

Note these two algorithms on wiki:

Iterative binary search:

int binary_search(int A[], int key, int imin, int imax)
{
  // continue searching while [imin,imax] is not empty
  while (imin <= imax)
    {
      // calculate the midpoint for roughly equal partition
      int imid = midpoint(imin, imax);
      if (A[imid] == key)
        // key found at index imid
        return imid;
      // determine which subarray to search
      else if (A[imid] < key)
        // change min index to search upper subarray
        imin = imid + 1;
      else        
        // change max index to search lower subarray
        imax = imid - 1;
    }
  // key was not found
  return KEY_NOT_FOUND;
}

Iterative binary search with deferred detection of equality:

int binary_search(int A[], int key, int imin, int imax)
{
  // continually narrow search until just one element remains
  while (imin < imax)
    {
      int imid = midpoint(imin, imax);

      // code must guarantee the interval is reduced at each iteration
      assert(imid < imax);
      // note: 0 <= imin < imax implies imid will always be less than imax

      // reduce the search
      if (A[imid] < key)
        imin = imid + 1;
      else
        imax = imid;
    }
  // At exit of while:
  //   if A[] is empty, then imax < imin
  //   otherwise imax == imin

  // deferred test for equality
  if ((imax == imin) && (A[imin] == key))
    return imin;
  else
    return KEY_NOT_FOUND;
}

You have three cases to consider, when imin < imax, imin == imax and imin > imax. The first algorithm deals with less than and equality within the while loop, whereas in the second algorithm, the equality case is deferred to the if statement. As wiki states:

The iterative and recursive versions take three paths based on the key comparison: one path for less than, one path for greater than, and one path for equality. (There are two conditional branches.) The path for equality is taken only when the record is finally matched, so it is rarely taken. That branch path can be moved outside the search loop in the deferred test for equality version of the algorithm.

The deferred detection approach foregoes the possibility of early termination on discovery of a match, so the search will take about log2(N) iterations. On average, a successful early termination search will not save many iterations. For large arrays that are a power of 2, the savings is about two iterations. Half the time, a match is found with one iteration left to go; one quarter the time with two iterations left, one eighth with three iterations, and so forth. The infinite series sum is 2.

The deferred detection algorithm has the advantage that if the keys are not unique, it returns the smallest index (the starting index) of the region where elements have the search key. The early termination version would return the first match it found, and that match might be anywhere in region of equal keys.

So the use of either <= in a while loop, or simply <, will depend on your choice of implementation.

Ensphere answered 24/2, 2016 at 21:59 Comment(0)

If we want to determine if a specific value exists in the sorted array or not, we would want to use <=, here's a visual walkthrough I made that really drills in as to why <= should be used https://youtu.be/7jci-yQhGho

Guideline answered 13/6, 2022 at 22:38 Comment(1)

When linking to your own site or content (or content that you are affiliated with), you must disclose your affiliation in the answer in order for it not to be considered spam. Having the same text in your username as the URL or mentioning it in your profile is not considered sufficient disclosure under Stack Exchange policy. – Leanneleanor 14/6, 2022 at 3:2

When using binary search, sometimes it's important to look at what low < high and low <= high may bring.

For example, Say you're at an iteration where you have an array like [50,10] where low and mid are at 50 : INDEX 0 and high is at 10 : INDEX 1.
Now if you were to use a while(low < high), let's say the condition would set low = mid + 1 since arr[low] > arr[high]. So now the low would be equal to high and the loop will break. Leaving mid at index 0.
If you require mid after the loop, the answer would simply be wrong, since it's at index 0 but low and high are both at index 1 , indicating that the smaller number was 10 instead of 50.

So in this case, we need while(loop <= high) so that mid will still be calculated when low == high, giving us low = mid = high.

Reference: https://medium.com/@hazemu/useful-insights-into-binary-search-problems-8769d388b9c

Gazebo answered 12/11, 2021 at 10:50 Comment(0)

Recommended topics

Hot tags