The standard mandates that all operations on iterators run in amortized constant time: http://www.eel.is/c++draft/iterator.requirements#general-10. The basic idea is that each iterator category only defines operations which can be implemented in amortized time.
Iteration is a common thing to do, and if operator++
on an iterator (I guess that's what you mean by next?) was logN, then traversing a container in a loop would be NlogN. The standard makes this impossible; since operator++
is amortized constant, iterating over any data structure in the standard is always O(N).
However, I dug into the implementation of multiset
on gcc5.4 to at least have one example. Both set
and multiset
are implemented in terms of the same underlying structure, _Rb_tree
. Delving into that structure a bit, it's nodes not only have left and right node pointers, but also a parent node pointer, and an iterator is just a pointer to a node.
Given a node in a binary search tree that includes a pointer to its parent, it's easy to figure out what the next node in the tree is:
- If it has a right child, descend to the right child. Then descend left child as far as you can; that's the next node.
- If it does not have a right child, ascend to your parent, and determine whether the original node is the left or right child of the parent. If the node is the left child of the parent, then the parent is the next node. If the node is the right of the parent, the parent was already processed, so you need to apply the same logic recursively between the parent and its grandparent.
This SO question shows the source code with the core logic: What is the definition of _Rb_tree_increment in bits/stl_tree.h? (it's surprisingly hard to find for some reason).
This does not have constant time, in particular in both 1. and 2. we have loops that either descend or ascend and could take at most log(N) time. However, you can easily convince yourself that the amortized time is constant because as you traverse the tree with this algorithm, each node is touched at most 4 times:
- Once on the way down to its left child.
- Once when it comes back up from the left child and needs to consider itself.
- Once when it descends to its right child.
- Once when ascending from the right child.
In retrospect I would say this is the fairly-obvious choice. Iteration over the whole data structure is a common operation, so the performance is very important. Adding a third pointer to the node is not a trivial amount of space, but it's not the end of the world either; at most it will bloat the data structure from 3 to 4 words (2 pointers + data, which alignment makes 3 at the minimum, vs 3 pointers + data). If you work with ranges, as opposed to two iterators, an alternative would be to maintain a stack and then you don't need the parent pointer, but this only works if you iterate from the very beginning to the end; it wouldn't allow iteration from an iterator in the middle to the end (which is also an important operation for BST's).
O(1)
worst-case iterators, nobody ever does it that way because of the extra cache footprint. – Flavouring