Guava's ImmutableSet
seems to perform quite poorly in my benchmark concerning contains
. For some sizes it gets even much slower than List
:
size benchmark ns linear runtime
100000 ListContains 110279.54 ==
100000 SetContains 7.15 =
100000 ImmutableSetContains 76716.47 =
200000 ListContains 275367.66 =====
200000 SetContains 7.34 =
200000 ImmutableSetContains 322185.50 ======
500000 ListContains 935210.10 ====================
500000 SetContains 7.79 =
500000 ImmutableSetContains 1382765.76 ==============================
Basically, I fill a set with a few thousands negative integers and test contains with non-negative ones. The code is trivial but a bit too long for pasting in a small text area, so please look here.
I wonder what's going on here. Probably, I hit some degenerate case, although I obviously didn't try to. Or maybe I've just blew the benchmark. Otherwise, I wonder if it can and should be fixed.
The solution was to change the smearing function by replacing
hashCode ^= (hashCode >>> 20) ^ (hashCode >>> 12);
return hashCode ^ (hashCode >>> 7) ^ (hashCode >>> 4);
by
return C2 * Integer.rotateLeft(hashCode * C1, 15);
This takes about the same time and might have some disadvantage, but solves the current problem by nicely spreading the hashes.