When writing a binary search there are a couple of things to consider. The first is what interval range you are searching over and specifically, how you are defining it.
For example, it could be inclusive of both low
and high
, meaning [low, high]
, but it could also be exclusive of high
, [low, high)
. Which of these you choose will change the rest of your algorithm.
The obvious implication is the initial values. Generally, high
should be the length of the array if it is exclusive and it should be one less if it's inclusive, but it could be something entirely different depending on the problem you're solving.
For the while loop you want it to terminate when the search interval is empty, meaning there are no more candidates to check. If you are using the interval [low, high]
, then this will be empty when low
is strictly greater than high
(for example, [5,5]
contains 5, but [6,5]
contains nothing), so the while loop will check for the opposite, low <= high
. However, if you use the interval [low, high)
, then this interval is empty when low
is equal to high
, so the while loop needs to check for low < high
.
Within the while loop, after checking mid
, you want to remove it from the interval so you don't check it again. If high
is inclusive, then you have to use one less than mid
as the new high
in order to exclude it from the interval. But if high
is exclusive, then setting high
equal to mid
is enough to exclude it.
As for when to update low
vs high
, this depends on what you're searching for. Besides the basic binary search where you just want to know if something exists exactly in the collection, you will have to consider what to do when you are as close as you can get.
In C++ for example, the more useful versions of binary_search
are called lower_bound
and upper_bound
. If the value being searched for doesn't exist in the container, then these both return the same position, namely the first position which is larger than the search value. This is convenient since this is the position you should insert that value if you want to keep the container sorted. However, if the value is in the container, possibly multiple times, then lower_bound
will return the first occurrence of the value, whereas upper_bound
will still return the first position larger than the value (or in other words, a right bound to the location of the values).
To get these different behaviors you update either the low
or high
bound when mid
is equal to the search value. If you want the lower bound, then you want to continue searching the lower half of your search range, so you bring high
down. If you want the high bound, then you bring low
up. In your example, it brings low
up when cnt == mid
, so it will find an upper bound.
As for what to return, it depends on both your search interval and what you're looking for. In your example, the while loop is checking (low < high)
, so low
and high
will be equal when it breaks and it doesn't matter which you use, but even then you may want to return left - 1
or left + 1
depending on the problem. If the while loop is (low <= high)
then when it breaks low == high + 1
, so it will depend on what you're looking for. When in doubt you can always think through an example.
So to put this all to use, here is a version of the solution you mentioned, but using an interval of [low, high] rather than [low, high):
class Solution {
public:
int findDuplicate(vector<int>& nums) {
int low = 1, high = nums.size() - 2;
while (low <= high) {
int mid = low + (high - low) * 0.5;
int cnt = 0;
for (auto a : nums) {
if (a <= mid) ++cnt;
}
if (cnt <= mid) low = mid + 1;
else high = mid - 1;
}
return low;
}
};
PS: The reason I didn't mention the interval (low, high]
or (low, high)
is because it messes with the math around calculating the mid
index. Because int math will round down, you can end up in a situation where mid
is searched again. For example, if low
is 7 and high
is 9, then low + (high - low) * 0.5
will be 8. After updating low to 8 (since it's exclusive you wouldn't add one), low + (high - low) * 0.5
will still be 8 and your loop will never terminate. You can get around this by adding 1 to the part being divided by 2, but generally it's cleaner to go with an interval where low
is inclusive.