Java collection binarySearch not working properly
Asked Answered
P

4

5

I am just trying to use native Java binarySearch hoping it can always find the first occurrence. But it's not always return the first occurrence, what have I done wrong here ?

import java.util.*;

class BinarySearchWithComparator
{
  public static void main(String[] args)
  {
    // Please scroll down to see 'User' class implementation.
    List<User> l = new ArrayList<User>();
    l.add(new User(10, "A"));
    l.add(new User(10, "A"));
    l.add(new User(10, "A"));

    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));
    l.add(new User(20, "B"));


    l.add(new User(30, "C"));
    l.add(new User(30, "C"));
    l.add(new User(30, "C"));
    l.add(new User(30, "C"));

    Comparator<User> c = new Comparator<User>() {
      public int compare(User u1, User u2) {
        return u1.getId().compareTo(u2.getId());
      }
    };

    // Must pass in an object of type 'User' as the key.
    // The key is an 'User' with the 'id' which is been searched for.
    // The 'name' field is not used in the comparison for the binary search,
    // so it can be a dummy value -- here it is omitted with a null.
    //
    // Also note that the List must be sorted before running binarySearch,
    // in this case, the list is already sorted.
    Collections.sort(l,c);

    int index = Collections.binarySearch(l, new User(10, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(20, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(30, null), c);
    System.out.println(index);

    index = Collections.binarySearch(l, new User(42, null), c);
    System.out.println(index);
  }
}

class User {
  private int id;
  private String name;

  public User(int id, String name) {
    this.id = id;
    this.name = name;
  }

  public Integer getId() {
    return Integer.valueOf(id);
  }
}

========

Result:
2 //not 0 ?
6 //not 3?
10 //ok
-15  ok

Why first 10 is not index 0 ?
Why first 20 is not index 3 ?
OK , 30 first occurrence is index 10

======================================

Update

Now it seems it's the API does not not gaurante that ! Can anyone give me working example of how to find the 1st occurrence and last occurrence for a given element (say User(10,null)?

Thanks a lot.

Phalan answered 25/3, 2013 at 1:16 Comment(0)
T
10

Java does not make a guarantee as to which element among the equal ones it is going to return. When you have multiple elements in an equal range, you need to walk the list down to find the first element that's equal to what you are looking for, like this:

User lookFor = new User(30, null);
index = Collections.binarySearch(l, lookFor, c);
while (index > 0 && c.compare(lookFor, l[index-1]) == 0) {
    index--;
}
// At this point the index is at the first element of the equal range

You can wrap this logic in a static function to avoid writing an explicit loop each time.

Note that if your collection is random access, this will cause worst-case performance (when there are many identical elements) to degrade from O(lg(n)) to O(n).

Tannic answered 25/3, 2013 at 1:23 Comment(2)
this is working, but is there a better way of doing this in java. In C++, std::equal_range, lower_bound, upper_bound does the job nicely...Phalan
@Gob00st I would wrap this in a static method similar to what you've got in STL.Tannic
J
5

but you have more than 1 element with id 10 on your list. So binarySearch isn't wrong

Java manual for binarySearch says this:

Searches the specified list for the specified object using the binary search algorithm. The list must be sorted into ascending order according to the natural ordering of its elements (as by the sort(List) method) prior to making this call. If it is not sorted, the results are undefined. If the list contains multiple elements equal to the specified object, there is no guarantee which one will be found.

http://docs.oracle.com/javase/6/docs/api/java/util/Collections.html#binarySearch(java.util.List, T)

Jaqitsch answered 25/3, 2013 at 1:19 Comment(0)
A
1

from the javadoc on Array.binarySearch():

Searches a range of the specified array of bytes for the specified value using the
binary search algorithm. The range must be sorted (as by the sort(byte[], int, int) 
method) prior to making this call. If it is not sorted, the results are undefined. 
If the range contains multiple elements with the specified value, there is no 
guarantee which one will be found.
Acetous answered 25/3, 2013 at 1:21 Comment(0)
H
1

Many of the items in your list are equal, for sorting purposes. If two items have the same ID, then it doesn't really matter which one you use, from a sorting perspective. Collections.binarySearch just splits the list in half until it finds a match. Once it finds a match, it is not going to check the previous item to see if it's also a match, it will just return the index.

Consider a more robust compareTo implementation, which sorts by more than just the ID, if items will have IDs.

Hesperides answered 25/3, 2013 at 1:21 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.