Using Binary Search with sorted Array with duplicates [duplicate]

B

10

33

I've been tasked with creating a method that will print all the indices where value x is found in a sorted array.

I understand that if we just scanned through the array from 0 to N (length of array) it would have a running time of O(n) worst case. Since the array that will be passed into the method will be sorted, I'm assuming that I can take advantage of using a Binary Search since this will be O(log n). However, this only works if the array has unique values. Since the Binary Search will finish after the first "find" of a particular value. I was thinking of doing a Binary Search for finding x in the sorted array, and then checking all values before and after this index, but then if the array contained all x values, it doesn't seem like it would be that much better.

I guess what I'm asking is, is there a better way to find all the indices for a particular value in a sorted array that is better than O(n)?

public void PrintIndicesForValue42(int[] sortedArrayOfInts)
{
    // search through the sortedArrayOfInts

    // print all indices where we find the number 42. 
}

Ex: sortedArray = { 1, 13, 42, 42, 42, 77, 78 } would print: "42 was found at Indices: 2, 3, 4"

Bianca answered 2/11, 2012 at 14:48 Comment(2)

your solution sounds good, if the array contained all x values you would have to look at all of them anyway, no matter what – Clad 2/11, 2012 at 14:52

@JonSkeet - Sorry about that typo. I've updated the array to a sorted one. – Bianca 2/11, 2012 at 15:3

C

29

Well, if you actually do have a sorted array, you can do a binary search until you find one of the indexes you're looking for, and from there, the rest should be easy to find since they're all next to each-other.

once you've found your first one, than you go find all the instances before it, and then all the instances after it.

Using that method you should get roughly O(lg(n)+k) where k is the number of occurrences of the value that you're searching for.

EDIT:

And, No, you will never be able to access all k values in anything less than O(k) time.

Second edit: so that I can feel as though I'm actually contributing something useful:

Instead of just searching for the first and last occurrences of X than you can do a binary search for the first occurence and a binary search for the last occurrence. which will result in O(lg(n)) total. once you've done that, you'll know that all the between indexes also contain X(assuming that it's sorted)

You can do this by searching checking if the value is equal to x , AND checking if the value to the left(or right depending on whether you're looking for the first occurrence or the last occurrence) is equal to x.

Chenay answered 2/11, 2012 at 14:52 Comment(5)

That solution is already stated in the question, right? – Subordinate 2/11, 2012 at 14:55

@Subordinate No, the solution he proposes in the question is a linear scan starting from index 0 until he finds the last index after the value he's looking for, which is O(n) and is a linear search, not a binary search. The solution in this answer is a binary search, followed by a linear scan, but the linear scan only happens on a subsection of the array. – Rowland 2/11, 2012 at 14:57

@Rowland from the question: "I was thinking of doing a Binary Search for finding x in the sorted array, and then checking all values before and after this index, but then if the array contained all x values, it doesn't seem like it would be that much better." -- sounds like exactly what you posted. – Subordinate 2/11, 2012 at 15:0

@Subordinate actually after second reading yes, he did have that in the question. I will now edit to reflect that. – Chenay 2/11, 2012 at 15:1

+1 for being able to do it in O(lg(n)). i should've given it more thought. – Mcquade 12/11, 2012 at 14:48

Y

47

You will get the result in O(lg n)

public static void PrintIndicesForValue(int[] numbers, int target) {
    if (numbers == null)
        return;

    int low = 0, high = numbers.length - 1;
    // get the start index of target number
    int startIndex = -1;
    while (low <= high) {
        int mid = (high - low) / 2 + low;
        if (numbers[mid] > target) {
            high = mid - 1;
        } else if (numbers[mid] == target) {
            startIndex = mid;
            high = mid - 1;
        } else
            low = mid + 1;
    }

    // get the end index of target number
    int endIndex = -1;
    low = 0;
    high = numbers.length - 1;
    while (low <= high) {
        int mid = (high - low) / 2 + low;
        if (numbers[mid] > target) {
            high = mid - 1;
        } else if (numbers[mid] == target) {
            endIndex = mid;
            low = mid + 1;
        } else
            low = mid + 1;
    }

    if (startIndex != -1 && endIndex != -1){
        for(int i=0; i+startIndex<=endIndex;i++){
            if(i>0)
                System.out.print(',');
            System.out.print(i+startIndex);
        }
    }
}

Yuan answered 3/12, 2013 at 18:1 Comment(0)

C

29

Well, if you actually do have a sorted array, you can do a binary search until you find one of the indexes you're looking for, and from there, the rest should be easy to find since they're all next to each-other.

once you've found your first one, than you go find all the instances before it, and then all the instances after it.

Using that method you should get roughly O(lg(n)+k) where k is the number of occurrences of the value that you're searching for.

EDIT:

And, No, you will never be able to access all k values in anything less than O(k) time.

Second edit: so that I can feel as though I'm actually contributing something useful:

Instead of just searching for the first and last occurrences of X than you can do a binary search for the first occurence and a binary search for the last occurrence. which will result in O(lg(n)) total. once you've done that, you'll know that all the between indexes also contain X(assuming that it's sorted)

You can do this by searching checking if the value is equal to x , AND checking if the value to the left(or right depending on whether you're looking for the first occurrence or the last occurrence) is equal to x.

Chenay answered 2/11, 2012 at 14:52 Comment(5)

That solution is already stated in the question, right? – Subordinate 2/11, 2012 at 14:55

@Subordinate No, the solution he proposes in the question is a linear scan starting from index 0 until he finds the last index after the value he's looking for, which is O(n) and is a linear search, not a binary search. The solution in this answer is a binary search, followed by a linear scan, but the linear scan only happens on a subsection of the array. – Rowland 2/11, 2012 at 14:57

@Rowland from the question: "I was thinking of doing a Binary Search for finding x in the sorted array, and then checking all values before and after this index, but then if the array contained all x values, it doesn't seem like it would be that much better." -- sounds like exactly what you posted. – Subordinate 2/11, 2012 at 15:0

@Subordinate actually after second reading yes, he did have that in the question. I will now edit to reflect that. – Chenay 2/11, 2012 at 15:1

+1 for being able to do it in O(lg(n)). i should've given it more thought. – Mcquade 12/11, 2012 at 14:48

M

3

public void PrintIndicesForValue42(int[] sortedArrayOfInts) {
    int index_occurrence_of_42 = left = right = binarySearch(sortedArrayOfInts, 42);
    while (left - 1 >= 0) {
        if (sortedArrayOfInts[left-1] == 42)
            left--;
    }
    while (right + 1 < sortedArrayOfInts.length) {
        if (sortedArrayOfInts[right+1] == 42)
            right++;
    }
    System.out.println("Indices are from: " + left + " to " + right);
}

This would run in O(log(n) + #occurrences) Read and understand the code. It's simple enough.

Mcquade answered 2/11, 2012 at 15:46 Comment(2)

I'm assuming that this would be O(log n + n) = O(n) if every element in the array was 42. However, this would be a very limited worst case. With that said it would be safe to assume that on a more "average" case it would be O(log n + k), where k is some constant number of occurrences which would be O(log n)? Just wondering, because this was what I had originally planned, but since it's based on a variable number of possible duplicates I was curious if there could be an algorithm that guarantees something better than O(n). But looking at the answers, it doesn't seem like it. – Bianca 2/11, 2012 at 16:27

You're not breaking out of those while loops, so in case you found the first non 42 number, the loop doesn't stop. – Humming 14/6, 2014 at 6:45

O

3

Another result for log(n) binary search for leftmost target and rightmost target. This is in C++, but I think it is quite readable.

The idea is that we always end up when left = right + 1. So, to find leftmost target, if we can move right to rightmost number which is less than target, left will be at the leftmost target.

For leftmost target:

int binary_search(vector<int>& nums, int target){
    int n = nums.size();
    int left = 0, right = n - 1;

    // carry right to the greatest number which is less than target.
    while(left <= right){
        int mid = (left + right) / 2;
        if(nums[mid] < target)
            left = mid + 1;
        else
            right = mid - 1;
    }
    // when we are here, right is at the index of greatest number
    // which is less than target and since left is at the next, 
    // it is at the first target's index
    return left;
}

For the rightmost target, the idea is very similar:

int binary_search(vector<int>& nums, int target){
    while(left <= right){
        int mid = (left + right) / 2;
        // carry left to the smallest number which is greater than target.
        if(nums[mid] <= target)
            left = mid + 1;
        else
            right = mid - 1;
    }
    // when we are here, left is at the index of smallest number
    // which is greater than target and since right is at the next, 
    // it is at the first target's index
    return right;
}

Overflow answered 26/10, 2017 at 8:52 Comment(0)

S

2

Below is the java code which returns the range for which the search-key is spread in the given sorted array:

public static int doBinarySearchRec(int[] array, int start, int end, int n) {
    if (start > end) {
        return -1;
    }
    int mid = start + (end - start) / 2;

    if (n == array[mid]) {
        return mid;
    } else if (n < array[mid]) {
        return doBinarySearchRec(array, start, mid - 1, n);
    } else {
        return doBinarySearchRec(array, mid + 1, end, n);
    }
}

/**
 * Given a sorted array with duplicates and a number, find the range in the
 * form of (startIndex, endIndex) of that number. For example,
 * 
 * find_range({0 2 3 3 3 10 10}, 3) should return (2,4). find_range({0 2 3 3
 * 3 10 10}, 6) should return (-1,-1). The array and the number of
 * duplicates can be large.
 * 
 */
public static int[] binarySearchArrayWithDup(int[] array, int n) {

    if (null == array) {
        return null;
    }
    int firstMatch = doBinarySearchRec(array, 0, array.length - 1, n);
    int[] resultArray = { -1, -1 };
    if (firstMatch == -1) {
        return resultArray;
    }
    int leftMost = firstMatch;
    int rightMost = firstMatch;

    for (int result = doBinarySearchRec(array, 0, leftMost - 1, n); result != -1;) {
        leftMost = result;
        result = doBinarySearchRec(array, 0, leftMost - 1, n);
    }

    for (int result = doBinarySearchRec(array, rightMost + 1, array.length - 1, n); result != -1;) {
        rightMost = result;
        result = doBinarySearchRec(array, rightMost + 1, array.length - 1, n);
    }

    resultArray[0] = leftMost;
    resultArray[1] = rightMost;

    return resultArray;
}

Suttee answered 2/11, 2014 at 18:54 Comment(0)

W

2

I came up with the solution using binary search, only thing is to do the binary search on both the sides if the match is found.

public static void main(String[] args) {
    int a[] ={1,2,2,5,5,6,8,9,10};
    System.out.println(2+" IS AVAILABLE  AT = "+findDuplicateOfN(a, 0, a.length-1, 2));
    System.out.println(5+" IS AVAILABLE  AT = "+findDuplicateOfN(a, 0, a.length-1, 5));
    int a1[] ={2,2,2,2,2,2,2,2,2};
    System.out.println(2+" IS AVAILABLE  AT = "+findDuplicateOfN(a1, 0, a1.length-1, 2));

    int a2[] ={1,2,3,4,5,6,7,8,9};
    System.out.println(10+" IS AVAILABLE  AT = "+findDuplicateOfN(a2, 0, a2.length-1, 10));
}

public static String findDuplicateOfN(int[] a, int l, int h, int x){
    if(l>h){
        return "";
    }
    int m = (h-l)/2+l;
    if(a[m] == x){
        String matchedIndexs = ""+m;
        matchedIndexs = matchedIndexs+findDuplicateOfN(a, l, m-1, x);
        matchedIndexs = matchedIndexs+findDuplicateOfN(a, m+1, h, x);
        return matchedIndexs;
    }else if(a[m]>x){
        return findDuplicateOfN(a, l, m-1, x);
    }else{
        return findDuplicateOfN(a, m+1, h, x);
    }
}


2 IS AVAILABLE  AT = 12 
5 IS AVAILABLE  AT = 43 
2 IS AVAILABLE  AT = 410236578 
10 IS AVAILABLE  AT =

I think this is still providing the results in O(logn) complexity.

Washedup answered 7/12, 2017 at 5:20 Comment(1)

In the worst case scenario of all duplicates, e.g. [5, 5, 5, 5, 5, 5, 5, 5], does this visit every member of the array and turn in to O(n)? – Yadirayaeger 12/2, 2020 at 14:24

A

1

A Hashmap might work, if you're not required to use a binary search.

Create a HashMap where the Key is the value itself, and then value is an array of indices where that value is in the array. Loop through your array, updating each array in the HashMap for each value.

Lookup time for the indices for each value will be ~ O(1), and creating the map itself will be ~ O(n).

Araucania answered 2/11, 2012 at 14:53 Comment(6)

would work, but sounds like overkill if you already have a sorted array. – Clad 2/11, 2012 at 14:54

True. SamIam's solution is still better, as long as it's sorted. – Araucania 2/11, 2012 at 14:55

@Clad how does sorting the array help with finding duplicates? – Seedman 2/11, 2012 at 15:15

@Seedman when the array is sorted, all duplicates are next to each other ;) So if you find one, you only have to look left and right to find all of the other ones. – Clad 2/11, 2012 at 15:19

@Clad but how do you know which value to look for? if you initially know the duplicate value, yea, you find one and look left and right until you hit another value. – Seedman 2/11, 2012 at 15:27

look at OP's question. He's not looking for duplicates. But if the number he's looking for has multiple occurences he wants to find all of them (their index respectively) – Clad 2/11, 2012 at 15:30

B

1

Find_Key(int arr[], int size, int key){
int begin = 0;
int end = size - 1;
int mid = end / 2;
int res = INT_MIN;

while (begin != mid)
{
    if (arr[mid] < key)
        begin = mid;
    else
    {
        end = mid;
        if(arr[mid] == key)
            res = mid;
    }
    mid = (end + begin )/2;
}
return res;
}

Assuming the array of ints is in ascending sorted order; Returns the index of the first index of key occurrence or INT_MIN. Runs in O(lg n).

Bogtrotter answered 6/2, 2014 at 0:44 Comment(1)

This should work only if begin starts from -1 (not 0) and end from size (not size - 1). Also you will have to take care of empty arrays. – Humming 16/6, 2014 at 14:56

H

1

It is using Modified Binary Search. It will be O(LogN). Space complexity will be O(1). We are calling BinarySearchModified two times. One for finding start index of element and another for finding end index of element.

private static int BinarySearchModified(int[] input, double toSearch)
    {
        int start = 0;
        int end = input.Length - 1;

        while (start <= end)
        {
            int mid = start + (end - start)/2;
            if (toSearch < input[mid]) end = mid - 1;
            else start = mid + 1;
        }

        return start;
    }


    public static Result GetRange(int[] input, int toSearch)
    {
        if (input == null) return new Result(-1, -1);

        int low = BinarySearchModified(input, toSearch - 0.5);

        if ((low >= input.Length) || (input[low] != toSearch)) return new Result(-1, -1);

        int high = BinarySearchModified(input, toSearch + 0.5);

        return new Result(low, high - 1);
    } 

 public struct Result
    {
        public int LowIndex;
        public int HighIndex;

        public Result(int low, int high)
        {
            LowIndex = low;
            HighIndex = high;
        }
    }

Heave answered 28/1, 2016 at 6:18 Comment(0)

S

0

public void printCopies(int[] array)
{
    HashMap<Integer, Integer> memberMap = new HashMap<Integer, Integer>();
    for(int i = 0; i < array.size; i++)
       if(!memberMap.contains(array[i]))
           memberMap.put(array[i], 1);
       else
       {
           int temp = memberMap.get(array[i]); //get the number of occurances
           memberMap.put(array[i], ++temp); //increment his occurance
       }

    //check keys which occured more than once
    //dump them in a ArrayList
    //return this ArrayList
 }

Alternatevely, instead of counting the number of occurances, you can put their indices in a arraylist and put that in the map instead of the count.

   HashMap<Integer, ArrayList<Integer>> 
   //the integer is the value, the arraylist a list of their indices

public void printCopies(int[] array)
{
    HashMap<Integer, ArrayList<Integer>> memberMap = new HashMap<Integer, ArrayList<Integer>>();
    for(int i = 0; i < array.size; i++)
       if(!memberMap.contains(array[i]))
       {
           ArrayList temp = new ArrayList();
           temp.add(i);
           memberMap.put(array[i], temp);
       }
       else
       {
           ArrayList temp = memberMap.get(array[i]); //get the lsit of indices
           temp.add(i);
           memberMap.put(array[i], temp); //update the index list
       }

    //check keys which return lists with length > 1
    //handle the result any way you want
 }

heh, i guess this will have to be posted.

 int predefinedDuplicate = //value here;
 int index = Arrays.binarySearch(array, predefinedDuplicate);
 int leftIndex, rightIndex;
 //search left
 for(leftIndex = index; array[leftIndex] == array[index]; leftIndex--); //let it run thru it
 //leftIndex is now the first different element to the left of this duplicate number string
 for(rightIndex = index; array[rightIndex] == array[index]; rightIndex++); //let it run thru it

 //right index contains the first different element to the right of the string
 //you can arraycopy this [leftIndex+1, rightIndex-1] string or just print it
 for(int i = leftIndex+1; i<rightIndex; i++)
 System.out.println(array[i] + "\t");

Seedman answered 2/11, 2012 at 14:58 Comment(14)

This will get you the number of occurrences of the number, not the locations of the number, which is what he wants. – Araucania 2/11, 2012 at 15:0

@Araucania look at the edit. feel free to remove the downvote whenever ;) – Seedman 2/11, 2012 at 15:1

There are still a TON of problems with your code. You have the basic idea right, but I already suggested this a while ago in my answer. – Araucania 2/11, 2012 at 15:6

@Araucania a "TON" ? please point out a few, I see nothing wrong here.... :/ It will map each value to a list of it's indices. Finding occurances of 'x' is simply return memberMap.get(x); and printing it. – Seedman 2/11, 2012 at 15:14

it's a void method working with local variables. Any call to it will have no effect... – Clad 2/11, 2012 at 15:21

@Clad i hope you're joking... – Seedman 2/11, 2012 at 15:26

no, i am not joking, just reading your code – Clad 2/11, 2012 at 15:27

What jlordo said, and more. This isn't really a problem but the entire top half of your answer is irrelevant, since he never said he wanted anything to do with any occurrences of the values, so I'm only going to look at the bottom one. You never gave your HashMap<Integer, ArrayList> a name, so this wouldn't even compile. memberMap seems to be leftover from your previous code snippet and you have yet to remove it. It is useless. – Araucania 2/11, 2012 at 15:27

When you create your ArrayList inside the for loop you never give it a type. It should be an ArrayList that contains Integers. – Araucania 2/11, 2012 at 15:29

memberMap is of type HashMap<Integer, Integer> and you are using ArrayList as values. Won't work. – Clad 2/11, 2012 at 15:32

From your third piece of code the condition of the for-loop should probably be array[leftIndex] == array[index] or else it would be always false – Clad 2/11, 2012 at 15:35

now you're just nitpicking it without concern that it's a demonstration of an idea and not just copy/pastable production-ready code. But I agree - your code is much better. – Seedman 2/11, 2012 at 15:35

clever way of adjusting left and right index, mine would have looked more complicated ;) – Clad 2/11, 2012 at 15:38

somehow i expected a "predefinedDuplicate was never assigned a value, this will never compile" kinda comment :P j/k. thanks. – Seedman 2/11, 2012 at 15:43

Recommended topics

Hot tags