While this answer by Peter G. in variant one (the swap-and-pop technique) is the fastest when you do not need to preserve the order, here is the unmentioned alternative which maintains the order.
With C++17 and C++20 the removal of multiple elements from a vector is possible with standard algorithms. The run time is O(N * Log(N)) due to std::stable_partition
. There are no external helper arrays, no excessive copying, everything is done inplace. Code is a "one-liner":
template <class T>
inline void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
v.resize(std::distance(
v.begin(),
std::stable_partition(v.begin(), v.end(),
[&selection, &v](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
static_cast<int>(static_cast<const T*>(&item) - &v[0]));
})));
}
The code above assumes that selection
vector is sorted (if it is not the case, std::sort
over it does the job, obviously).
To break this down, let us declare a number of temporaries:
// We need an explicit item index of an element
// to see if it should be in the output or not
int itemIndex = 0;
// The checker lambda returns `true` if the element is in `selection`
auto filter = [&itemIndex, &sorted_sel](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
itemIndex++);
};
This checker lambda is then fed to std::stable_partition
algorithm which is guaranteed to call this lambda only once for each element in the original (unpermuted !) array v
.
auto end_of_selected = std::stable_partition(
v.begin(),
v.end(),
filter);
The end_of_selected
iterator points right after the last element which should remain in the output array, so we now can resize v
down. To calculate the number of elements we use the std::distance
to get size_t
from two iterators.
v.resize(std::distance(v.begin(), end_of_selected));
This is different from the code at the top (it uses itemIndex
to keep track of the array element). To get rid of the itemIndex
, we capture the reference to source array v
and use pointer arithmetic to calculate itemIndex
internally.
Over the years (on this and other similar sites) multiple solutions have been proposed, but usually they employ multiple "raw loops" with conditions and some erase/insert/push_back calls. The idea behind stable_partition
is explained beautifully in this talk by Sean Parent.
This link provides a similar solution (and it does not assume that selection
is sorted - std::find_if
instead of std::binary_search
is used), but it also employs a helper (incremented) variable which disables the possibility to parallelize processing on larger arrays.
Starting from C++17, there is a new first argument to std::stable_partition
(the ExecutionPolicy
) which allows auto-parallelization of the algorithm, further reducing the run-time for big arrays. To make yourself believe this parallelization actually works, there is another talk by Hartmut Kaiser explaining the internals.