Can a std::map be efficiently split into two std::maps at an iterator?

Asked 2/3, 2021 at 23:40 Answered 3/3, 2021 at 0:5

Let's say I have a std::map<int, std::string> myMap containing the data

1. Red
2. Blue
3. Green
5. Fuchsia
6. Mauve
9. Gamboge
10. Vermillion

and also an std::map<int, std::string>::iterator it pointing at the element

5. Fuchsia

I would like to do something like (making this up)

std::map<int, std::string> myHead = eject(myMap, myMap.begin(), it);

which would result in myMap containing

5. Fuchsia
6. Mauve
9. Gamboge
10. Vermillion

and myHead containing

1. Red
2. Blue
3. Green

I could accomplish this by doing something like

std::map<int, std::string> myHead;
myHead.insert(myMap.begin(), it);
myMap.erase(myMap.begin(), it);

but this seems suboptimal in at least some cases, e.g. if I pick a point such that I'm just splitting off a subtree. (I'll admit that I haven't actually thought through the actual details of the algorithmic complexity here, but if we imagine a case where the value type is extraordinarily expensive to copy then it's clear that the above can't be optimal in general.)

Question: is there a way that I can get std::map to perform this operation in an optimal manner, or do I have to write my own binary search tree where I have access to the internals to accomplish this?

Underscore answered 2/3, 2021 at 23:40 Comment(1)

@ildjarn: I am aware of those but I don't see a way to use them to accomplish what I'm describing here. extract operates on a single node (without bringing along the nodes underneath it), so it seems like any solution based on that would necessarily be at least O(m), where m is the number of elements in the piece being split off. And I don't see a use for merge at all here. – Underscore 3/3, 2021 at 0:26

If we're speaking asymptotic complexity, you can do this in O(log n) time for most self-balancing tree types, using two operations colloquially known as split and join. There's an extensive Wikipedia article on this.

You can not get this complexity using std::map, you'll need to roll your own or a third-party self-balancing tree implementation. If you need to do this operation often, this is well worth it. The best you can get using the standard library is O(n), which can be many orders of magnitude slower.

You can do it in O(n) in C++11 as:

template<class K, class T, class C, class A>
std::map<K, T, C, A> eject(
    std::map<K, T, C, A>& my_map,
    std::map<K, T, C, A>::iterator begin,
    std::map<K, T, C, A>::iterator end,
) {
    std::map<K, T, C, A> result;
    while (begin != end) {
        auto next = std::next(begin);
        // C++11
        result.insert(result.end(), std::move(*begin));
        my_map.erase(begin);
        // C++17 (avoids move and destruct)
        // result.insert(result.end(), my_map.extract(begin));
        begin = next;
    }
    return result;
}

Pacifistic answered 3/3, 2021 at 0:5 Comment(8)

Thanks, this is helpful. Are there any well-known third-party BSTs that implement this functionality? – Underscore 3/3, 2021 at 0:27

@DanielMcLaury I don't know, sorry. – Pacifistic 3/3, 2021 at 0:30

No, the std library can split a map in O(size of smaller map), not n lg n. – Adaptive 3/3, 2021 at 1:44

Just extract and insert in order at end. Both are amortized constant time operations, on size of smaller container nodes. – Adaptive 3/3, 2021 at 2:34

@Yakk-AdamNevraumont Fair enough, I'll edit the answer to O(n), although that still can be many orders of magnitude slower. – Pacifistic 3/3, 2021 at 11:6

@Pacifistic You used a danging iterator. Also it needlessly copied keys in c++17 and did allocations. Both fixed. :) – Adaptive 3/3, 2021 at 14:31

@Yakk-AdamNevraumont I rolled back your change. 1. I don't know what a dangling iterator is, but I can assure you begin is still a valid iterator after std::move(*begin). 2. No, my version was O(n), both erase and insert with a correct hint are amortized O(1). – Pacifistic 3/3, 2021 at 14:34

@Yakk-AdamNevraumont Ah I misunderstood, erase invalidates the iterator, you are right, will fix. The complexity was correct either way, although the C++17 method avoids a move/destruct, so it is preferable. – Pacifistic 3/3, 2021 at 14:40

There is extract, but it operates on nodes not node ranges.

And you can efficiently combine maps.

But there is no efficient (faster than O(n)) range based extract.

Adaptive answered 2/3, 2021 at 23:58 Comment(0)

You can use move iterators, as follows

int main(){
    auto my_map = std::map<int, std::string>{
        {1, "Read"s},
        {2, "Blue"s},
        {3, "Green"s},
        {5, "Fuchsia"s},
        {6, "Mauve"s},
        { 9, "Gamboge"s },
        {10, "Vermillion"s}
    };
    auto it = my_map.begin();
    std::advance(it, 3);
    auto head_map = std::map{
        std::make_move_iterator(my_map.begin()),
        std::make_move_iterator(it)
    };
    auto tail_map = std::map{
        std::make_move_iterator(it),
        std::make_move_iterator(my_map.end())
    };
    std::cout << "The Head\n";
    for (auto [key, value]: head_map){
        std::cout << key << ":" << value << " ";
    }
    std::cout << "\n\nThe Tail\n";
    for (auto [key, value]: tail_map){
        std::cout << key << ":" << value << " ";
    }
}

Demo

Yeorgi answered 3/3, 2021 at 0:0 Comment(3)

So, if I understand correctly, this addresses the issue of any potential costs associated to moving the value type, but when we call the constructor to build tail_map we're just inserting each element of the tail sequentially? – Underscore 3/3, 2021 at 0:23

@DanielMcLaury No. I think the begin and it iterators are assigned to begin and end iterators of the new map only. – Yeorgi 3/3, 2021 at 0:30

@DanielMcLaury Yeah. u r right I think u need to implement yours. – Yeorgi 3/3, 2021 at 0:46

Recommended topics

Hot tags