What does the vertical pipe | mean in the context of c++20 and ranges?

There are usages of | which look more like function pipelining or chaining than a bitwise OR, seen in combination with C++20 ranges. Things like:

#include <ranges>
#include <vector>

template<typename T>
std::vector<T> square_vector(const std::vector<T> &some_vector) {
    auto result = some_vector | std::views::transform([](T x){ return x*x; });
    return {result.begin(), result.end()};
}

... where clearly the | operator is not used in the usual sense of the bitwise or. Since when does it work, and on what sort of functions/objects? Are these like regular views? What are some caveats?

Inarticulate answered 21/4, 2022 at 11:56 Comment(4)
Is this question and your own answer an attempt to revive the (now shut-down) StackOverflow Documentation? Or Wiki? I mean, this info can just be found on cppreference - Capitalization
@Capitalization I never read articles, so no. What you write is true, but I wanted a much shorter, easily read Q&A to make this understandable (for me it took a bit once I saw it), not to mention that looking for "pipeline" and the like sends you nowhere. So, my goal is searchable on the syntax, and short. Also, having things on another site was never a deterrent as far as I know. - Inarticulate
Also check the range v3 docs, as that was the base for ranges in C++20 - Capitalization
@Capitalization Thanks! I did go over much of it. I don't think reproducing much of the documents is necessary, and answers as they come on specific things (as they have) should be fine. I think this post is condensed enough to point someone in the right direction once they see the pipes, but you made me realize I should add the documentation links to the answer, so double thanks. - Inarticulate

This sort of function chaining was introduced with C++20 ranges, whose biggest feature is lazy evaluation of operations on views (more precisely, viewable ranges). This means the operation transforming the view only acts on an element as the view is iterated.

These semantics make the pipeline syntactic sugar possible: it states, in a readable way, what will happen when the result is iterated. The functions it is used with are based on range adaptors, which take a viewable range (and possibly additional arguments after it) and return another view that applies the transformation as it is iterated.

The pipeline syntax is reserved for a special subgroup of these called range adaptor closures, which take a single viewable range and no additional parameters. A closure can be an adaptor that needs no additional arguments (such as std::views::reverse), an adaptor with its extra arguments already bound, or the result of a library call such as the std::views::transform(...) in the question, which binds the function and returns a closure. Since C++23 you can also define these yourself (by deriving from std::ranges::range_adaptor_closure). Once we have some of these, the syntax:

std::views::some_adaptor_closure | some_other_adaptor_closure

combines the closures into a single closure that can be reused. To apply closures, you pipe a viewable range into them:

some_viewable_range | std::views::some_adaptor_closure | some_other_adaptor_closure

which is equivalent to

some_other_adaptor_closure(std::views::some_adaptor_closure(some_viewable_range))

and the pipeline is evaluated only as the returned view is iterated. Similarly,

some_vector | std::views::transform([](T x){ return x*x; });

is the same as

// std::views::transform(f) returns a closure with f bound; calling that closure
// with some_vector is equivalent to std::views::transform(some_vector, f).
std::views::transform([](T x){ return x*x; })(some_vector);

but more readable.

The resulting view, like any view, can be iterated directly. Since evaluation of pipes is lazy, bad things can happen, such as:

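#include <iostream>
#include <ranges>
#include <vector>

template<typename T>
auto square_vector(const std::vector<T> &some_vector) {
    // The returned view only refers to some_vector; it does not own the data.
    return some_vector | std::views::transform([](T x){ return x*x; });
}

int main () {
    // Before C++23, the temporary vector is destroyed before the loop body runs.
    for(auto val : square_vector(std::vector<int>{1, 2, 3, 4, 5}))
        std::cout << val << '\n';
}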

By the time you get to print val, the original vector no longer exists, so the input to the chain is gone and iterating the view is undefined behavior. This specific problem is fixed in the C++23 standard, which extends the lifetime of temporaries in the range-based for initializer, see https://open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2718r0.html - but note that compiler support was not yet universal as of June 2024. Care must still be taken to ensure that the original range outlives every use of the view.

To delve further into the world of ranges and adaptors you can check https://en.cppreference.com/w/cpp/ranges, and the original library these were based on, https://ericniebler.github.io/range-v3/.

Inarticulate answered 21/4, 2022 at 12:30 Comment(4)
You don't need to start a pipeline with a viewable range. auto adaptor = std::views::some_adaptor_closure | some_other_adaptor_closure; defines a range adaptor closure. - Kilovolt
@Kilovolt Indeed, that was just an example of some common use case I've seen, I can make it more explicit. - Inarticulate
I think that in C++23, temporaries in the init statement of a range-based for loop will be extended to the entire loop removing that problem: open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2718r0.html. At least according to cppreference, this is only supported in Clang 19 and not at all in GCC. I will be very happy when it is widely supported. - Astrionics
@Astrionics that's a great update, thanks, I'll add it to the answer. - Inarticulate
