Profiling my cpu-bound code has suggested I that spend a long time checking to see if a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with <
and =
defined), I have two ideas on how this might be done:
The first using a set:
template <class T>
bool is_unique(vector<T> X) {
set<T> Y(X.begin(), X.end());
return X.size() == Y.size();
}
The second looping over the elements:
template <class T>
bool is_unique2(vector<T> X) {
typename vector<T>::iterator i,j;
for(i=X.begin();i!=X.end();++i) {
for(j=i+1;j!=X.end();++j) {
if(*i == *j) return 0;
}
}
return 1;
}
I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true, it is lighting fast if X[0]==X[1]
but takes (understandably) O(N^2) time if all the elements are unique.
Is there a better way to do this, perhaps a STL algorithm built for this very purpose? If not, are there any suggestions eek out a bit more efficiency?
is_unique
would be faster if it took asconst vector<T>&
as its argument instead of accepting its argument by value. That way, you avoid making a copy of the vector and then also copying that copy into a set. – Fassettis_unique
(whether implemented bystd::set
orstd::unique
) run in linear time. By keeping the vector sorted, you spread out the work over time, and have to "pay" for some of the work only once per element, rather than taking a big hit by having to re-compute everything each time you callis_unique
. – Fassettset
should work very well, particularly if you test for membership on every insert instead of adding everything first and then checking. Note that standard STL doesn't have any hash-based containers, althoughhash_set
is present as an extension; or you could use Boost's. – Mertiemerton