Why can't I remove a string from a std::set with std::remove_if? [duplicate]
Asked Answered
U

2

11

Possible Duplicate:
remove_if equivalent for std::map

I have a set of strings:

set <wstring> strings;
// ...

I wish to remove strings according to a predicate, e.g.:

std::remove_if ( strings.begin(), strings.end(), []( const wstring &s ) -> bool { return s == L"matching"; });

When I attempt this, I get the following compiler error:

c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1840): error C2678: binary '=' : no operator found which takes a left-hand operand of type 'const std::basic_string<_Elem,_Traits,_Ax>' 

The error appears to suggest that std::string doesn't have a by-value copy constructor ( which would be illegal). Is it somehow bad to use std::remove_if with std::set ? Should I be doing something else instead such as several iterations of set::find() followed by set::erase() ?

Unbend answered 21/6, 2012 at 15:34 Comment(2)
Your simple sample can be replaced by strings.erase(L"matching");. One presumes that your actual predicate isn't as trivial.Trost
@Robᵩ Yes that was just an example, I do require a function object.Unbend
U
21

std::remove_if (or std::erase) works by reassigning the values of the members of the range. It doesn't understand how std::set organizes data, or how to remove a node from its internal tree data structure. Indeed, it's impossible to do so using only references to nodes, without having the set object itself.

The standard algorithms are designed to have transparent (or at least consistently easy-to-remember) computational complexities. A function to selectively remove elements from a set would be O(N log N), due to the need to rebalance the tree, which is no better than a loop calling my_set.remove() . So, the standard doesn't provide it, and that is what you need to write.

On the other hand, a naively hand-coded loop to remove items from a vector one-by-one would be O(N^2), whereas std::remove_if is O(N). So the library does provide a tangible benefit in that case.

A typical loop (C++03 style):

for ( set_t::iterator i = my_set.begin(); i != my_set.end(); ) {
    if ( condition ) {
        my_set.erase( i ++ ); // strict C++03
        // i = my_set.erase( i ); // more modern, typically accepted as C++03
    } else {
        ++ i; // do not include ++ i inside for ( )
    }
}

Edit (4 years later!): i ++ looks suspicious there. What if erase invalidates i before the post-increment operator can update it? This is fine, though, because it's an overloaded operator++ rather than the built-in operator. The function safely updates i in-place and then returns a copy of its original value.

Ulbricht answered 21/6, 2012 at 15:37 Comment(10)
I wonder why they didn't generalise the notion of an "assignable iterator" (for want of a better term), given that there are serveral containers that exhibit this behaviour.Professed
@Professed There is such a notion, but it's not a fix. The problem is that the std::remove_if is specified as being O(N), and an implementation that is not O(N) cannot be called std::remove_if by the letter of the law. You can provide your own implementation in your own namespace. The overload will conflict with the one in std, though. You're better off just writing a loop.Ulbricht
@Rook: In this case the problem is not the iterator, but the value_type of the container. std::remove_if modifies the values through dereferencing the iterator, but the value_type in a std::set is a constant object.Materiel
remove_if should be called shuffle_to_the_front_unless: then it would be easy to see that it can't possibly be used on a set, and that it doesn't make structural changes to any container it's used over. Which, as you say, iterators never do.Thiourea
Ahh, I carelessly assumed that the iterator was returning a constant reference. Having the value type be constant is kinda obvious in retrospect.Professed
You can also do i = my_set.erase(i); like the example here.Hidalgo
The code is fine, but the reason is wrong. The real reason is that in the expression f(a) the evaluation of a is sequenced before the invocation of the function. This has nothing to do with operator overloading.Cauterize
@L.F. The sequence point of f(a) is introduced by overloading and the built-in operation lacks a counterpart.Ulbricht
@Ulbricht [[expr.call]/8]: All side effects of argument evaluations are sequenced before the function is entered (see [intro.execution]). (eel.is/c++draft/expr.call#8.note-1) If an operator function is invoked using operator notation, argument evaluation is sequenced as specified for the built-in operator; see [over.match.oper]. (eel.is/c++draft/expr.call#8.note-2) Am I missing something?Cauterize
@L.F. Again that is only talking about functions, not the absence of any function. It doesn’t say that built-in operators get sequenced like functions. Only the other way around. Also, note that this answer is 10 years old. Things might have changed and they might change still.Ulbricht
P
10

The error message says

no operator found which takes a left-hand operand of type 'const std::basic_string<_Elem,_Traits,_Ax>'

Note the const. The compiler is correct that std::wstring doesn't have an operator= which can be called on a const object.

Why is the string const? The answer is that the values in a std::set are immutable, because values in a set are ordered, and changing a value could change its ordering in the set, invalidating the set.

Why is the compiler trying to copy a value of the set?

std::remove_if (and std::remove) don't actually erase anything (nor can they, because they don't have the container, only iterators). What they do is to copy all values in the range which don't match the criterion to the beginning of the range, and return an iterator to the next element after the matching elements. You are then supposed to manually erase from the returned iterator to the end of the range. Since a set keeps its elements in order, it would be wrong to move any elements around, so remove_if cannot be used on a set (or any other associative container).

In short, you do have to use a loop of std::find_if and set::erase, like so:

template<class V, class P>
void erase_if(std::set<V>& s, P p)
{
  std::set<V>::iterator e = s.begin();
  for (;;)
  {
    e = std::find_if(e, s.end(), p);
    if (e == s.end())
      break;
    e = s.erase(e);
  }
}
Pianola answered 21/6, 2012 at 16:2 Comment(6)
It would actually be possible for a library to provide std::remove_if compatible with std::set, by using the special knowledge that the root node actually lives inside the container object. However, it would not meet the O(N) runtime requirement.Ulbricht
Oops, the end() node, not the root, but you get the idea.Ulbricht
+1, you spelled the reasoning out very well and provided a nice alternative, but I would probably name that function erase_if.Southworth
@Benjamin Lindley Good point, changed.Pianola
@Ulbricht It would also be inconsistent - its behaviour would be very different to remove_if on other containers.Pianola
@Pianola No, the only difference would be the runtime complexity. remove_if is a stable algo so the elements wouldn't be reordered. It would also not assign values of sequence elements, but instead change the sequence itself, but that aspect isn't specified at all by the standard.Ulbricht

© 2022 - 2024 — McMap. All rights reserved.