Why is the worst case complexity of std::unordered_multiset
insert is linear? I understand why is that the case for std::unordered_set
(you have to check that inserted value is not in the set) but for multiset I don't get it. Am I missing something obvious?
The worst case complexity for std::unordered_multiset::insert()
is linear because:
- Unordered associative containers that support non-unique keys are said to support equivalent keys. When iterating these containers, elements with equivalent keys are adjacent to one another in iteration, forming equivalent-key groups.
- Iterator functions require constant amortized time.
For example, consider the case where 5
, 13
, and 13
are inserted into an unordered_multiset
that has 4
buckets, and unordered_multiset::key_eq(5, 13)
returns false
. In this case, unordered_multiset::hash_function(5)
returns different hash codes for both 5
and 13
. Despite having different hash codes, these elements may still be inserted into the same bucket. If the hash function for an integer returns the integer itself, and the bucket index is the result of the hash code modulus the number of buckets, then:
- Element
5
is hashed to5
, and with4
buckets, it is placed in bucket1
. - Element
13
is hashed to13
, and with4
buckets, it is placed into bucket1
as well.
While unordered_set::insert()
checks to prevent duplicates during insertion, unordered_multiset::insert()
identifies where to insert the element for equivalent-key grouping. In the worst case, the bucket contains [5, 13]
when inserting the final 13
, and upon iterating over all elements, the bucket contains [5, 13, 13]
. As iteration over all elements occurs, the complexity is linear in size()
.
It is worth noting that a rehash may occur during unordered_multiset::insert()
, and unordered_multiset::rehash()
is specified as having a complexity with an average case linear in size()
and the worst case is quadratic. During a rehash, all elements in the original hash table are iterated over and inserted into a new hash table. As iteration has a complexity linear in size()
, and as identified above, each insertion has a worse case linear in size()
, the resulting worst case is O(size()*size())
.
© 2022 - 2024 — McMap. All rights reserved.