How to flatten the nested std::optional?

Note: this question was briefly marked as a duplicate of this, but it is not an exact duplicate, since I am asking about std::optional specifically. It is still a good question to read if you care about the general case.

Assume I have nested optionals, something like this (dumb toy example):

struct Person{
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form{
    std::optional<Person> person;
};

and this spammy function:

void PrintMiddleName(const std::optional<Form> form){
    if (form.has_value() && form->person.has_value() && form->person->middle_name.has_value()) {
        std::cout << *(*(*form).person).middle_name << std::endl; 
    } else {
        std::cout << "<none>"  << std::endl; 
    }
}

What would be the best way to flatten this optional check? I have made something like this. It is not variadic, but I do not care that much about that (I can add one more level, an overload with membr3, if really necessary, and everything beyond that is terrible code anyway).

template<typename T, typename M>
auto flatten_opt(const std::optional<T> opt, M membr){
    if (opt.has_value() && (opt.value().*membr).has_value()){
        return std::optional{*((*opt).*membr)};
    }
    return decltype(std::optional{*((*opt).*membr)}){};
}

template<typename T, typename M1, typename M2>
auto ret_val_helper(){
    // better code would use declval here since T might not be 
    // default constructible.
    T t;
    M1 m1;
    M2 m2;
    return ((t.*m1).value().*m2).value();
}

template<typename T, typename M1, typename M2>
std::optional<decltype(ret_val_helper<T, M1, M2>())> flatten_opt(const std::optional<T> opt, M1 membr1, M2 membr2){
    if (opt.has_value() && (opt.value().*membr1).has_value()){
        const auto& deref1 = *((*opt).*membr1);
        if ((deref1.*membr2).has_value()) {
            return std::optional{*(deref1.*membr2)};
        }
    }
    return {};
}

void PrintMiddleName2(const std::optional<Form> form){
    auto flat  = flatten_opt(form, &Form::person, &Person::middle_name);
    if (flat) {
        std::cout << *flat;
    }
    else {
        std::cout << "<none>"  << std::endl; 
    }
}

godbolt

notes:

  • I do not want to switch away from std::optional to some better optional.
  • I do not care that much about perf: unless I return a pointer, I must make a copy (unless the argument is a temporary), since std::optional does not support references.
  • I do not care about flatten_has_value function(although it is useful), since if there is a way to nicely flatten the nested optionals there is also a way to write that function.
  • I know my code looks like it works, but it is quite ugly, so I am wondering if there is a nicer solution.
Seasoning answered 2/4, 2021 at 17:9 Comment(17)
A less-spammy if (form.has_value() && form->person.has_value() && form->person->middle_name.has_value()) would be if (form && form->person && form->person->middle_name). A less-spammy *(*(*form).person).middle_name would be form->person->middle_name.Ahner
It's a bit confusing to me that you want to use optional, but you're OK with getting a default-constructed value from it if it is empty. Wouldn't that mean that you would be unable to distinguish between an optional that's empty and an optional that happens to contain a default-constructed value? So why use optional at all?Centesimo
@DrewDormann tnx, also I think you are missing one star, *form->person->middle_nameSeasoning
@NicolBolas not sure I understand the question. Optional in my usecase is just a way to represent some members that may or may not be present in the structs.Seasoning
@NoSenseEtAl: But your flatten function requires that the type be default constructible and that a default-initialized object of that value is conceptually the same thing as "not being present". Because that's how your flatten function treats it. The only reason to have an optional<T> when T is default constructible is to recognize a distinction between T{} and not being present. Your code obliterates that distinction.Centesimo
@NicolBolas still not following, if you are talking about ret_val_helper it is just here to compute the type, it is never run at runtime.Seasoning
Using std::optional for std::string rarely makes sense. Certainly in this case, as there is no need to differentiate between a missing middle name vs a blank middle name. So I would change const std::optional<std::string> middle_name; to const std::string middle_name; and then use if (form && form->person && !form->person->middle_name.empty()) { /* use form->person->middle_name as needed */ }Hulse
@NoSenseEtAl: Consider middle_name. What is the difference between middle_name which has a value of an empty string and middle_name which has no value? If the answer is "there is never a difference", if none of your code ever treats these two cases as different, then it shouldn't be optional.Centesimo
@RemyLebeau it is a toy example, real code could have been a complex struct/class and calling code could notice difference between default constructed and nullopt.Seasoning
@NicolBolas std::optional<Form> f1{Form{.person = Person{"Bjarne", "", "Stroustrup"}}}; prints nothing (empty string), std::optional<Form> f1{Form{.person = Person{"Bjarne", std::nullopt, "Stroustrup"}}}; prints <none>Seasoning
@NoSenseEtAl: But flatten_opt wouldn't do that. At least, not your first one.Centesimo
@NicolBolas yes, that is correct, I only distinguish between all_of optionals are active and ! all_of are active, example: if we get 0 or 1 level deep with chained calls to has_value and then 2 levels deep has_value return false we return nullopt, only if we go all the way 2 levels deep we return the last optional. No difference if we fail at level 0 or at level 1.Seasoning
This article seems appropriate...if you wanted to be pedantic, all the data members would be std::optional.Tynishatynwald
This seems to answer your question. What's the problem?Lenhard
@AyxanHaqverdili compare Barry's operator overloading there with Mooing Duck operator overloading here. Mooing Duck is much nicer IMAO. Reason being that this question is a specific subset of "duplicate" question so that allows Mooing Duck answer to use that one argument is std::optional. Additionally expecting that most or even significant minority of people visiting duplicate will be able to get to Mooing Duck answer by adopting Barry's answer is IMAO very unlikely.Seasoning
@Seasoning I meant the access function in Barry's answer. Here's how you can make it work for std::optional trivially. I think it's an exact duplicate, tackling the exact same issue. changing T* to std::optional<T> or std::unique_ptr<T> or some other fancy pointer doesn't warrant a new question imo.Lenhard
@AyxanHaqverdili wrt that I agree, it is variadic version of code in this question. Plus I did not know you can use invoke on stuff that is not a function. :)Seasoning
U
14

The operation you're looking for is called the monadic bind operation, and is sometimes spelled and_then (as it is in P0798 and Rust).

You're taking an optional<T> and a function T -> optional<U> and want to get back an optional<U>. In this case the function is a pointer to data member, but it really does behave as a function in this sense. &Form::person takes a Form and gives back an optional<Person>.

You should write this in a way that is agnostic to the kind of function. The fact that it's specifically a pointer to member data isn't really important here, and maybe tomorrow you'll want a pointer to member function or even a free function. So that's:

template <typename T,
          typename F,
          typename R = std::remove_cvref_t<std::invoke_result_t<F, T>>,
          typename U = mp_first<R>>
    requires SpecializationOf<R, std::optional>
constexpr auto and_then(std::optional<T> o, F f) -> std::optional<U>
{
    if (o) {
        return std::invoke(f, *o);
    } else {
        return std::nullopt;
    }
}

This is one of the many kinds of function declarations that are just miserable to write in C++, even with concepts. I'll leave it as an exercise to properly add references into there. I choose to specifically write it as -> optional<U> rather than -> R because I think it's important for readability that you can see that it does, in fact, return some kind of optional.
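For anyone who wants to try this without mp_first (which looks like the Boost.Mp11 helper) or a SpecializationOf concept, here is a minimal C++17 sketch of the same idea. The is_optional trait, the const-reference parameter, and the middle_name_or_none helper are my own assumptions for illustration and are not part of the declaration above.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical trait standing in for the SpecializationOf concept.
template <typename T>
struct is_optional : std::false_type {};
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

// and_then: optional<T> plus a callable T -> optional<U> gives optional<U>.
template <typename T, typename F>
auto and_then(const std::optional<T>& o, F f)
{
    // std::invoke on a pointer to data member yields a reference, so decay it.
    using R = std::decay_t<std::invoke_result_t<F, const T&>>;
    static_assert(is_optional<R>::value, "f must return a std::optional");
    if (o) {
        return R(std::invoke(f, *o));
    }
    return R();  // empty optional of the result type
}

// Types from the question.
struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

// Illustrative helper: nested calls read inside-out, which motivates
// the operator-based chaining discussed next.
std::string middle_name_or_none(const std::optional<Form>& form)
{
    auto flat = and_then(and_then(form, &Form::person), &Person::middle_name);
    return flat.value_or("<none>");
}
```

This version copies the inner optional at each step, which matches the question's stated indifference to performance.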

Now, the question is how do we chain this to multiple functions. Haskell uses >>= for monadic bind, but in C++ that has the wrong associativity (o >>= f >>= g would evaluate f >>= g first, so you would need parentheses). So the next closest choice of operator would be >> (which means something different in Haskell, but we're not Haskell, so it's okay). Or you could implement this borrowing the | model that Ranges uses.

So we'd either end up syntactically with:

auto flat  = form >> &Form::person >> &Person::middle_name;

or

auto flat = form | and_then(&Form::person)
                 | and_then(&Person::middle_name);
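Neither spelling exists out of the box, so here is a rough C++17 sketch of how both could be wired up. The and_then_fn wrapper and the two helper functions at the bottom are hypothetical names I made up for this example, and a real version would want tighter constraints than the trailing return type gives.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Spelling 1: opt >> f. Left-to-right associativity makes chains work.
template <typename T, typename F>
auto operator>>(const std::optional<T>& o, F f)
    -> std::decay_t<std::invoke_result_t<F, const T&>>
{
    using R = std::decay_t<std::invoke_result_t<F, const T&>>;
    if (o) return R(std::invoke(f, *o));
    return R();
}

// Spelling 2: opt | and_then(f). and_then(f) only stores the callable.
template <typename F>
struct and_then_fn {
    F f;
};

template <typename F>
and_then_fn<F> and_then(F f)
{
    return {std::move(f)};
}

template <typename T, typename F>
auto operator|(const std::optional<T>& o, const and_then_fn<F>& step)
    -> std::decay_t<std::invoke_result_t<const F&, const T&>>
{
    using R = std::decay_t<std::invoke_result_t<const F&, const T&>>;
    if (o) return R(std::invoke(step.f, *o));
    return R();
}

// Types from the question.
struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

std::string middle_via_shift(const std::optional<Form>& form)
{
    auto flat = form >> &Form::person >> &Person::middle_name;
    return flat.value_or("<none>");
}

std::string middle_via_pipe(const std::optional<Form>& form)
{
    auto flat = form | and_then(&Form::person)
                     | and_then(&Person::middle_name);
    return flat.value_or("<none>");
}
```

The >> overload here accepts any callable on the right-hand side of a std::optional, which is fairly greedy. The | form is a little more verbose at the call site but only reacts to the and_then_fn wrapper.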

A different way to compose multiple monadic binds together is an operation that Haskell spells >=>, which is called Kleisli composition. In this case, it takes a function T -> optional<U> and a function U -> optional<V> and produces a function T -> optional<V>. This is something that is exceedingly annoying to write constraints for so I'm just going to skip it, and it would look something like this (using the Haskell operator spelling):

template <typename F, typename G>
constexpr auto operator>=>(F f, G g) {
    return [=]<typename T>(T t){
        using R1 = std::remove_cvref_t<std::invoke_result_t<F, T>>;
        static_assert(SpecializationOf<R1, std::optional>);
        using R2 = std::remove_cvref_t<std::invoke_result_t<G, mp_first<R1>>>;
        static_assert(SpecializationOf<R2, std::optional>);

        if (auto o = std::invoke(f, t)) {
            return std::invoke(g, *o);
        } else {
            // can't return nullopt here, have to specify the type
            return R2();
        }
    };
}

And then you could write (or at least you could if >=> were an operator you could use):

auto flat  = form | and_then(&Form::person >=> &Person::middle_name);

Because the result of >=> is now a function that takes a Form and returns an optional<string>.
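Since >=> can't actually be overloaded in C++, a named function is one way to get the same composition. Below is a small C++17 sketch under that assumption. The name kleisli, the two-argument and_then, and the middle_name_or_none helper are all hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <type_traits>

// Minimal bind, as a free function taking the optional first.
template <typename T, typename F>
auto and_then(const std::optional<T>& o, F f)
{
    using R = std::decay_t<std::invoke_result_t<F, const T&>>;
    if (o) return R(std::invoke(f, *o));
    return R();
}

// Kleisli composition: given f : T -> optional<U> and g : U -> optional<V>,
// build a callable T -> optional<V>.
template <typename F, typename G>
auto kleisli(F f, G g)
{
    return [f, g](const auto& t) {
        auto first = std::invoke(f, t);  // a copy of the optional<U>
        using R2 = std::decay_t<decltype(std::invoke(g, *first))>;
        if (first) return R2(std::invoke(g, *first));
        return R2();  // can't return nullopt here, the type must be spelled
    };
}

// Types from the question.
struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

std::string middle_name_or_none(const std::optional<Form>& form)
{
    // The composed callable takes a Form and returns optional<string>.
    auto get_middle = kleisli(&Form::person, &Person::middle_name);
    return and_then(form, get_middle).value_or("<none>");
}
```

The composed callable can be stored and reused, which is the main practical difference from chaining binds at every call site.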

Unofficial answered 2/4, 2021 at 23:27 Comment(2)
Great answer! minor suggestion for future answers: if you have some collection of concepts you keep you could maybe add them somewhere public and then link them in future when you have posts like this(just do not try to write Sortable concept, it is well known one person can not write it ;) ). For example I know what SpecializationOf is since I remember the questions about this, but I doubt most people reading this answer would know how to google it/find it on SO.Seasoning
There is the overloadable binary ptr-to-memptr-bind operator (operator ->*): auto flat= form ->* &Form::person ->* &Person::middle_name; en.cppreference.com/w/cpp/language/operator_precedence . Check NO.4 in the table; The associativity is also LTR; which is good.Williamwilliams
C
5

Let's look at what the optimal form of a flatten function would look like. By "optimal" in this case, I mean the smallest presentation.

Even in the optimal case, at the point of performing the flatten operation, you would need to provide:

  1. The optional<T> object to flatten.
  2. The flatten operation function name.
  3. A list of names, in order, to be indirected from at each flattening step.

Your code is very close to optimal. The only issue is that each name in the "list of names" must contain the typename of the member you're accessing at that level, which is something that hypothetically could be computed using knowledge of T.

C++ has no mechanism to do any better than this. If you want to access a member of an object, you must provide the type of that object. If you want to indirectly do this, C++ allows member pointers, but getting such a pointer requires knowing the type of the object at the point when the member is extracted. Even offsetof gymnastics would require using the type name when you're getting the offset.

Reflection would allow for something better, as you could pass compile-time strings that static reflection could use to fetch member pointers from the type currently in use. But C++20 has no such feature.
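To make that call shape concrete (the optional, the function name, then the list of qualified member names), here is a possible variadic C++17 sketch. It is my own illustration rather than the code from the question, it uses std::invoke so member pointers and other callables both work, and it still needs the Type:: qualifier on every member for the reasons given above.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Base case: nothing left to indirect through.
template <typename T>
std::optional<T> flatten_opt(const std::optional<T>& opt)
{
    return opt;
}

// Recursive case: step through one member, then flatten the rest.
// Each member must yield a std::optional, otherwise this fails to compile.
template <typename T, typename M, typename... Ms>
auto flatten_opt(const std::optional<T>& opt, M member, Ms... rest)
{
    if (opt) {
        return flatten_opt(std::invoke(member, *opt), rest...);
    }
    // The decltype operand is unevaluated; it only names the result type.
    return decltype(flatten_opt(std::invoke(member, *opt), rest...)){};
}

// Types from the question.
struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

std::string middle_name_or_none(const std::optional<Form>& form)
{
    auto flat = flatten_opt(form, &Form::person, &Person::middle_name);
    return flat.value_or("<none>");
}
```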

Centesimo answered 2/4, 2021 at 17:31 Comment(0)
M
5

You've got a lot of helper functions for something that is fundamentally a chainable operation. And C++ has things for chains: operators. So I'd probably (ab)use operator* for this.

For your specific case, all you need is

template<class class_t, class member_t>
std::optional<std::remove_cv_t<member_t>> operator*(
        const std::optional<class_t>& opt, 
        const std::optional<member_t> class_t::*member) 
{
    if (opt.has_value()) return opt.value().*member;
    else return {};
}

void PrintMiddleName2(const std::optional<Form> form){
    auto middle = form * &Form::person * &Person::middle_name;
    if (middle) {
        std::cout << *middle;
    }
    else {
        std::cout << "<none>"  << std::endl; 
    }
}

But in reality you'd probably also want variants for non-optional members, getter methods, and arbitrary transforms, which I've listed here, though I'm not 100% certain they all compile properly.

//data member
template<class class_t, class member_t>
std::optional<std::remove_cv_t<member_t>> operator*(const std::optional<class_t>& opt, const std::optional<member_t> class_t::*member) {
    if (opt.has_value()) return opt.value().*member;
    else return {};
}
template<class class_t, class member_t>
std::optional<std::remove_cv_t<member_t>> operator*(const std::optional<class_t>& opt, const member_t class_t::*member) {
    if (opt.has_value()) return {opt.value().*member};
    else return {};
}

//member function (const-qualified, since opt is a const reference)
template<class class_t, class return_t>
std::optional<std::remove_cv_t<return_t>> operator*(const std::optional<class_t>& opt, std::optional<return_t>(class_t::*member)() const) {
    // .* binds less tightly than the call operator, so the parentheses are required
    if (opt.has_value()) return (opt.value().*member)();
    else return {};
}
template<class class_t, class return_t>
std::optional<std::remove_cv_t<return_t>> operator*(const std::optional<class_t>& opt, return_t(class_t::*member)() const) {
    if (opt.has_value()) return {(opt.value().*member)()};
    else return {};
}

//arbitrary function
template<class class_t, class return_t, class arg_t>
std::optional<std::remove_cv_t<return_t>> operator*(const std::optional<class_t>& opt, std::optional<return_t>(*transform)(arg_t&&)) {
    if (opt.has_value()) return transform(opt.value());
    else return {};
}
template<class class_t, class return_t, class arg_t>
std::optional<std::remove_cv_t<return_t>> operator*(const std::optional<class_t>& opt, return_t(*transform)(arg_t&&)) {
    if (opt.has_value()) return {transform(opt.value())};
    else return {};
}

http://coliru.stacked-crooked.com/a/26aa7a62f38bbd89

Messina answered 2/4, 2021 at 17:37 Comment(1)
I usually hate defining operators, but this is quite nice... :)Seasoning
