For the erase-remove idiom, why is the second parameter necessary which points to the end of the container?
Asked Answered
H

1

17

Consider the following code (taken from cppreference.com, slightly adapted):

#include <algorithm>
#include <string>
#include <iostream>
#include <cctype>

int main()
{
    std::string str1 = "     Text with some   spaces";
    str1.erase(std::remove(str1.begin(), str1.end(), ' '), str1.end());
    std::cout << str1 << '\n';

    return 0;
}

Why is the second parameter to erase neccessary? (I.e. str1.end() in this case.)

Why can't I just supply the iterators which are returned by remove to erase? Why do I have to tell it also about the last element of the container from which to erase?

The pitfall here is that you can also call erase without the second parameter but that produces the wrong result, obviously.

Are there use cases where I would not want to pass the end of the container as a second parameter to erase?

Is omitting the second parameter of erase for the erase-remove idiom always an error or could that be a valid thing to do?

Hooten answered 11/5, 2019 at 14:5 Comment(6)
erase takes a begin and end iterator. remove shuffles (kind of) the items to remove to the end, and returns a "new ending" iterator -- which is passed to the erase function as the start of the range to erase. You could make a helper function that does these two operations for you in one function call.Downtrodden
Consider that you may want to erase a range other than the one identified by remove.Lumbye
One of the reasons to not do both operations in one go is it may be significantly more efficient for some algorithms to do a bunch of removes, maybe also reuse some of the dangling space, before finally erasing the tail end of the container.Downtrodden
@Downtrodden -- re: "before finally erasing the tail end of the container" -- the fundamental reason for separating the operations is that algorithms apply to sequences, and sequences do not have to have an underlying container. That is, there may well not be a container whose tail has to be removed.Greenman
In many languages containers are themselves something that you pass around to algorithms. But not in C++. In C++ the algorithms don't know about containers - they only know about the sequence of elements between two iterators. Mostly those two iterators are from a container (necessarily, the same container). But sometimes they don't represent a container. (Any given container can implement an algorithm as a method which then knows about the container.) And that is why remove() shuffles elements around but doesn't remove them. It doesn't know the container to remove them from.Singletree
Related info: softwareengineering.stackexchange.com/questions/231336/…Downtrodden
G
18

std::remove returns one iterator; it's the new past-the-end iterator for the sequence. But when the sequence is managed by a container, the size of the container hasn't changed; std::remove shuffles the order of the elements in the sequence, but doesn't actually remove any of them.

To get rid of the elements in the container that are not part of the new sequence you call, of course, container.erase(). But the goal is to remove all the extra elements; calling container.erase() with only one iterator tells it to remove that element. To tell container.erase() to erase everything "from here to the end" you have to tell it both where "here" is and where the end is. So that means two iterators.

If it helps, think of the "remove/erase" idiom as two separate steps:

auto new_end = std::remove(str1.begin(), str1.end(), ' ');
str1.erase(new_end, str1.end());
Greenman answered 11/5, 2019 at 14:18 Comment(2)
Hmm, I had a completely wrong impression about what std::remove would be doing. Thanks for the clarification! I am wondering, however, if this behavior of std::remove could apply to any type of container. It appears to me that it would be applicable only to containers which store their elements contiguously. I can not imagine, how this shuffling behavior could be applied to containers like hash sets.Hooten
@j00hi: The issue with std::unordered_set isn't that it stores its elements non-contiguously, but that "moving" an element is meaningless: elements are distinguished by their hashes rather than by their position. (By contrast, std::list, which also stores elements non-contiguously, works fine with std::remove.) Not coincidentally, std::unordered_set<...>::iterator is actually a constant iterator type, so it doesn't meet the requirements of std::remove -- your compiler should reject something like std::remove(hash_set.begin(), hash_set.end(), ' ').Norrisnorrv

© 2022 - 2024 — McMap. All rights reserved.