What is the best approach for using std::ranges/std::views with std::expected in C++23?

Asked 25/2, 2023 at 23:33 Answered 31/8, 2024 at 16:49

Let me describe a scenario. First, we have a function that returns some data whose validity we can't be sure of, i.e.

auto random_predicate() -> bool
{
    int v = uniform_dist(e1); // uniform distribution 1 to 100
    return (v % 5);
}

where uniform_dist() is an appropriately seeded uniform distribution, and we have an enum class which we shall use for error handling, i.e.

enum class Error
{
    ValueError
};

Then we perform some sort of views-based processing which uses random_predicate() from within an operation, as follows:

std::vector<int> vs{1,2,3,4,5};

auto result = vs
   | views::filter([](int i){ return i % 2; })
   | views::transform([](int i) -> std::expected<int, Error> {
      auto v = random_predicate();
      if (v) return std::unexpected<Error>(Error::ValueError);
      else return i * i; 
   });

So, by the end of this operation, the assertion

static_assert(
    std::is_same_v<
        std::decay_t<std::ranges::range_value_t<decltype(result)>>, 
        std::expected<int, Error>
    >
);

will in fact be true.

The question is, then what? We now have a view of std::expected values which we need to resolve to either an error type which propagates up the call stack, or a view of the success type (i.e. a view of int in the above example, with all elements not a multiple of 5!).

My solution

My solution is to simply check if each element is in error and then if there are no errors transform the result to the desired view, so something like

template<typename T>
static auto has_error(const std::expected<T, Error>& e){ return !e.has_value(); };

auto f(const std::vector<int>& vs)
{    
    auto c = vs
        | views::filter([](int i){ return i % 2; })
        | views::transform([](int i) -> std::expected<int, Error> {
            auto v = random_predicate();
            if (v) return std::unexpected<Error>(Error::ValueError);
            else return i * i; 
        });

    if (auto v = ranges::find_if(c, has_error<int>); v != c.end()) 
    {
        return (*v).error();
    }
    else 
    {
        return c | views::transform([](auto&& e){ return e.value(); });
    }
}

But then we run into the problem that the function cannot deduce the return type to be std::expected<T, Error>, where T is the type of a container with elements of type int (in the case of the above example). And I don't even know what to write for T here, so my question is: how should this be implemented?

godbolt: https://godbolt.org/z/Wfjr8o3qM

Alternatively, I'm interested in hearing how others might approach this problem in a better way altogether.

Thanks

Edit: I suppose you don't really want to return a view of some elements, as it may lead to a dangling view? In that case, is it best to just use ranges::to<T>() when returning from a function?

Exemplar answered 25/2, 2023 at 23:33 Comment(2)

Definitely not an easy question. If performance is not an issue and you can use external libraries, #67717280 suggests using any_view from range-v3. Regarding your edit, I think that as long as your original data is valid in the scope where you use the views, it should be fine (as in the code on SO, not the code in the Godbolt) – Residue 26/2, 2023 at 0:4

Thank you, yes, I updated the question to move the vector out of the function for that very reason, but forgot to update the godbolt! Thank you for the link. – Exemplar 26/2, 2023 at 0:14

Another option to still be able to return the range:

auto f(const std::vector<int> &vs) {
    auto c = vs
             | views::filter([](int i) { return i % 2; })
             | views::transform([](int i) -> std::expected<int, Error> {
        auto v = random_predicate();
        if (v) return std::unexpected<Error>(Error::ValueError);
        else return i * i;
    });

    auto values = c | views::transform([](auto &&e) { return e.value(); });

    using success_t = decltype(values);
    using ret_t = std::expected<success_t, Error>;

    if (auto v = ranges::find_if(c, has_error<int>); v != c.end()) {
        return ret_t(std::unexpected<Error>((*v).error()));
    } else {
        return ret_t(values);
    }
}

This relies on values being a view, which is evaluated lazily: if it is not returned, its elements are never computed, so we can use its type to determine the return type. The next step is to wrap every return in ret_t so that auto deduces a single return type.

Note: both in this answer and in your original question, this only works if the original range can be iterated multiple times (I can't remember the name of this concept, sorry).

Residue answered 26/2, 2023 at 0:37 Comment(1)

"both in this answer and your original question, this only works in the original range can be iterated on multiple time" Namely, ranges::forward_range. – Ahasuerus 26/2, 2023 at 3:51

The first thing to note is that the lambda passed to views::transform calls random_predicate() internally, so it can produce a different result each time it is invoked with the same argument. That is not equality-preserving, so the lambda does not model the regular_invocable concept required by transform_view, and the behavior is undefined.

As this answer illustrates, suppose you have a range of std::expected<int, E>, you may want to collect them into std::expected<std::vector<int>, E>, which is quite useful (That's why ranges::to does not constrain the return type to be a range in the end).

Although it is still not possible to get the "correct" behavior described by the answer via ranges::to<expected> for the current standard, it is not difficult to implement that:

template<ranges::input_range R>
  requires is_expected<ranges::range_value_t<R>>
auto to_expected(R&& r) {
  using expected_type = ranges::range_value_t<R>;
  using value_type = expected_type::value_type;
  using error_type = expected_type::error_type;
  using return_type = std::expected<std::vector<value_type>, error_type>;

  std::vector<value_type> v;
  for (auto it = r.begin(); it != r.end(); ++it) {
    if ((*it).has_value())
      v.push_back((*it).value());
    else
      return return_type(std::unexpect, (*it).error());
  }
  return return_type(std::move(v));
}

Then in your example, you can pass the range of std::expected<int, E> into this function to get a std::expected<std::vector<int>, E>

auto f() {
  std::vector<int> vs{1,2,3,4,5};
  auto c = vs | views::filter([](int i){ return i % 2; });

  // make a range of std::expected
  std::vector<std::expected<int, Error>> r;
  for (int i : c) {
    if (random_predicate())
      r.emplace_back(std::unexpected<Error>(Error::ValueError));
    else
      r.emplace_back(i * i);
  }

  return to_expected(r);
}

which ensures we only traverse the original range once, making it work even if it is an input_range.

Demo

Psychodiagnosis answered 26/2, 2023 at 10:42 Comment(2)

Very nice, thank you. I did not know that it was undefined behaviour, but this is a very useful bit of code! – Exemplar 26/2, 2023 at 13:11

That's UB, which is likely to produce unexpected results, eg. godbolt.org/z/j1KqxYncP – Ahasuerus 26/2, 2023 at 13:34

I think what you are looking for is the equivalent of the collect function in Rust.

This is a great tool and very useful when working with both ranges and expected or optional.

I was working on a similar case and @Caleth provided me with a prototype of this function implemented in C++23. I played with it a little; it's great and does exactly what we expect it to do.

I added a few modifications and my current version is here.

Simple use case:

#include <fmt/ranges.h>
int main() {
    std::vector<std::expected<int, std::string>> has_error = {
        1, 2, std::unexpected("NOT INT")};
    std::vector<std::expected<int, std::string>> no_error = {1, 2, 3};

    std::expected<std::vector<int>, std::string> exp_error = has_error 
        | views::collect();
    auto exp_value = no_error | views::collect();

    auto print = [](const auto& expected) {
        if (expected.has_value())
            fmt::println("Valid result : {}", expected.value());
        else
            fmt::println("Error : {}", expected.error());
    };

    print(exp_error);
    print(exp_value);
}

Output :

Error : NOT INT
Valid result : [1, 2, 3]

This is really a minimal prototype, there's a lot of optimization to be done, but above all a lot of generalization.

Many thanks to @caleth for his first prototype, which was the major part of the work on this function.

--- EDIT ---

I made a public GitHub repository for this feature, available here

Accordant answered 31/8, 2024 at 16:49 Comment(0)
