I was reading this question. The selected answer contains the following two algorithms. I couldn't understand why the first one's time complexity is O(ln(n)). At the worst case, if the array don't contain any duplicates it will loop n times so does the second one. Am I wrong or am I missing something? Thank you
1) A faster (in the limit) way
Here's a hash based approach. You gotta pay for the autoboxing, but it's O(ln(n)) instead of O(n2). An enterprising soul would go find a primitive int-based hash set (Apache or Google Collections has such a thing, methinks.)
boolean duplicates(final int[] zipcodelist)
{
Set<Integer> lump = new HashSet<Integer>();
for (int i : zipcodelist)
{
if (lump.contains(i)) return true;
lump.add(i);
}
return false;
}
2)Bow to HuyLe
See HuyLe's answer for a more or less O(n) solution, which I think needs a couple of add'l steps:
static boolean duplicates(final int[] zipcodelist) {
final int MAXZIP = 99999;
boolean[] bitmap = new boolean[MAXZIP+1];
java.util.Arrays.fill(bitmap, false);
for (int item : zipcodeList)
if (!bitmap[item]) bitmap[item] = true;
else return true;
}
return false;
}
O(n*log(n))
, which is what you'd get from e.g. a set backed by a balanced binary search tree. It's pretty obvious that there can't be a way faster thanO(n)
. – RecognizeeO(log(size))
, so if you don
lookups and inserts, you haveO(n*log n)
operations. I guess that's where the log comes from. – RecognizeeO(n*log n)
. For a HashSet, it'sO(n)
if the HashSet keeps theO(1)
insert and lookup promise. (It can degrade toO(n^2)
for bad implementations of a HashSet, but I'm optimistic that it won't forjava.util.HashSet
.) My guess is that the author of the answer to the linked question thought of a tree set first and typo'edO(n*log n)
asO(log n)
and later decided to use a HashSet, forgetting to change the complexity. Or originally had the right complexity and miscorrected. – Recognizee