Why does rehash have quadratic worst-case complexity, but operator[] only linear worst-case complexity?

I know about this question, but mine is a bit different.

Why does rehash have quadratic complexity, while operator[] (which can call rehash) has only linear complexity in the worst case?

Sorry, I don't have the 10 reputation points needed to add images, so in case everything has already been fixed by the time you read this, here is what I saw on cppreference:

rehash complexity:

Average case linear in the size of the container, worst case quadratic.

operator[] complexity:

Average case: constant, worst case: linear in size.

I know why rehash can have quadratic complexity; however, it is not difficult to make it linear. So either statement can be true on its own, but not both together (unless different sizes are meant, but then I don't see what could count as the size other than the number of elements).
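
To make the quadratic case concrete, here is a rough sketch of what I mean (a toy separate-chaining table with made-up names, not any real library's code): if rehash just pushes every element back through the ordinary insert path, the duplicate-check scan of the destination bucket makes the whole pass quadratic once all keys collide; since the stored keys are already known to be unique, that scan can simply be dropped, which is what makes a linear rehash easy.

    #include <cstddef>
    #include <functional>
    #include <list>
    #include <vector>

    // Toy separate-chaining hash table; all identifiers are illustrative only.
    struct ToyTable {
        std::vector<std::list<int>> buckets;

        static std::size_t index(int key, std::size_t bucket_count) {
            return std::hash<int>{}(key) % bucket_count;
        }

        // Naive rehash: re-insert every key through the usual insert logic,
        // which first scans the destination bucket for an equal key.
        // If every key lands in the same bucket, the k-th key scans k-1
        // predecessors, so the whole rehash costs 1 + 2 + ... + n = O(n^2).
        void rehash_naive(std::size_t new_bucket_count) {
            std::vector<std::list<int>> fresh(new_bucket_count);
            for (const auto& bucket : buckets)
                for (int key : bucket) {
                    auto& dst = fresh[index(key, new_bucket_count)];
                    bool already_present = false;
                    for (int existing : dst)   // the scan that hurts: O(bucket size)
                        if (existing == key) { already_present = true; break; }
                    if (!already_present)
                        dst.push_back(key);    // without the scan, this step is O(1)
                }
            buckets.swap(fresh);
        }
    };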

Hep asked 9/2, 2024 at 18:40

I believe this is a (minor) inconsistency in the current C++ standard that could merit an LWG issue. There seem to be two plausible approaches to resolving the discrepancy:

  1. Maintain the loose quadratic worst-case complexity constraint of the rehash function and revise the wording concerning the complexity of emplace (and similar functions like insert). This revision could resemble the following:

    Complexity: Average case O(1), worst case O(a_uniq.size()) if no rehashing occurs; otherwise O(a_uniq.size()²).

  2. Retain the strict linear worst-case complexity constraint for insertion functions and update the wording of rehash (and potentially related constructors and assignment operators) to:

    Complexity: Average case linear in a.size(), worst case quadratic.

Personally, I favor the second approach. It not only provides clarity in wording but also accurately reflects the efficiency of unordered associative containers in the standard library. Achieving linear worst-case complexity is relatively feasible (especially within libstdc++), and I anticipate library vendors opting for the most efficient approach. This rationale is consistent with the decision to update the worst-case complexity of std::sort from O(N²) to O(N log N) in LWG713.
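
To back up the claim that the linear bound is attainable, here is a hedged sketch of the relinking approach on a toy node-based table (invented identifiers, not libstdc++'s actual internals): each existing node is spliced onto the front of its new bucket exactly once, with no comparisons and no allocations, so the pass is O(n + bucket count) no matter how badly the keys collide.

    #include <cstddef>
    #include <forward_list>
    #include <functional>
    #include <vector>

    // Toy node-based hash table; all identifiers are illustrative only.
    struct ToyTable {
        std::vector<std::forward_list<int>> buckets;

        static std::size_t index(int key, std::size_t bucket_count) {
            return std::hash<int>{}(key) % bucket_count;
        }

        // Linear rehash: the stored keys are already known to be distinct,
        // so no destination bucket ever needs to be scanned. Each node is
        // spliced (relinked without allocation or comparison) to the front
        // of its new bucket in O(1), giving O(n + new_bucket_count) overall.
        void rehash_linear(std::size_t new_bucket_count) {
            std::vector<std::forward_list<int>> fresh(new_bucket_count);
            for (auto& bucket : buckets)
                while (!bucket.empty()) {
                    auto& dst = fresh[index(bucket.front(), new_bucket_count)];
                    dst.splice_after(dst.before_begin(), bucket, bucket.before_begin());
                }
            buckets.swap(fresh);
        }
    };

Since the worst case of this scheme coincides with its average case, the stricter wording in option 2 seems realistic for implementations to meet.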


An aside: the original hash table proposal actually specified linear complexity, whereas the wording finally adopted for C++11 opted for the quadratic bound.

Narah answered 10/4, 2024 at 16:41 Comment(2)
I'm also of the second opinion. But "an aside" is confusing. – Hep
@bash These papers are written by various authors, so their preferences on this matter may differ. – Narah
