Providing different implementations of a class depending on lvalue/rvalue when using expression templates
Asked Answered
H

5

20

The problem

Suppose we implement a string class which represents, uhm, strings. We then want to add an operator+ which concatenates two strings, and decide to implement that via expression templates to avoid multiple allocations when doing str1 + str2 + ... + strN.

The operator will look like this:

stringbuilder<string, string> operator+(const string &a, const string &b)

stringbuilder is a template class, which in turn overloads operator+ and has an implicit string conversion operator. Pretty much the standard textbook exercise:

template<class T, class U> class stringbuilder;

template<> class stringbuilder<string, string> {
    stringbuilder(const string &a, const string &b) : a(a), b(b) {};
    const string &a;
    const string &b;
    operator string() const;
    // ...
}

// recursive case similar,
// building a stringbuilder<stringbuilder<...>, string>

The above implementation works perfectly as long as someone does

string result = str1 + str2 + ... + strN;

However, it has a subtle bug. Assigning the result to a variable of the right type will make that variable hold references to all the strings that compose the expression. That means, for instance, that changing one of the strings will change the result:

void print(string);
string str1 = "foo";
string str2 = "bar";
right_type result = str1 + str2;
str1 = "fie";
print(result); 

This will print fiebar, because of the str1 reference stored inside the expression template. It gets worse:

string f();
right_type result = str1 + f();
print(result); // kaboom

Now the expression template will contain a reference to a destroyed value, crashing your program straight away.

Now what's that right_type? It is of course stringbuilder<stringbuilder<...>, string>, i.e. the type the expression template magic is generating for us.

Now why would one use a hidden type like that? In fact, one doesn't use it explicitely -- but C++11's auto does!

auto result = str1 + str2 + ... + strN; // guess what's going on here?

The question

The bottom line is: it seems that this way of implementing expression templates (by storing cheap references instead of copying values or using shared pointers) gets broken as soon as one tries to store the expression template itself.

Therefore, I'd pretty much like a way of detecting if I'm building a rvalue or a lvalue, and provide different implementations of the expression template depending on whether a rvalue is built (keep references) or a lvalue is built (make copies).

Is there an estabilished design pattern to handle this situation?

The only things I was able to figure out during my research were that

  1. One can overload member functions depending on this being an lvalue or rvalue, i.e.

    class C {
        void f() &; 
        void f() &&; // called on temporaries
    }
    

    however, it seems I can't do that on constructors as well.

  2. In C++ one cannot really do ``type overloads'', i.e. offer multiple implementations of the same type, depending on how the type is going to be used (instances created as lvalues or rvalues).

Hannis answered 22/7, 2013 at 20:24 Comment(15)
Rather than detecting rvalue/lvalue ref in the constructor why don't you detect this by overloading operator+ with rvalue/lvalue ref combinations.Pulp
@Pulp because that does not solve the problem.Prevailing
Don't let them declare variables of the right_type.Gilliette
@n.m. you also have to block them from declaring variables of right_type& and right_type&& and right_type const&. I can block right_type, how do we block the others? Secret types? Will that block auto?Prevailing
@Hannis the rvalue case is easy -- when creating expression templates containing rvalues, you could create a local copy and move into it. Note that the proper stringbuilder signature does perfect forwarding, not take-by-const&. So your stringbuilder<string, string> is distinct from stringbuilder<string const&, string const&> -- one has two temporaries, the other has a pair of const&s. Even if you don't store copies (and instead store string&&) you should do this, because this lets you know if you can move out of those temporaries or not.Prevailing
Doesn't operator string() &&; cut it? Sure, someone could use move, but there's not anything more we can do. This problem has been unsolved for a very long time.Savor
@Peppe: This is a known problem with auto that nobody who I asked (including real experts) could give an answer to. :-(Levant
@Yakk right_type& and right_type const& are doable if you make everything in sight private, but right_type&& is indeed a problem.Gilliette
C++chatroom suggested adding operator auto to the language :DSavor
@Yakk if we disallow all this then we won't be able to pass the objects of right_type to functions, which kind of defeats the idea. Nope, won't work.Gilliette
@MooingDuck: I'd also add operator X&() and operator X&&() ;)Gilliette
@n.m.: don't those already work? I'm pretty sure the first does at least. and operator X() handles the second pretty closely.Savor
@MooingDuck: you can declare them right now but they don't work as members of X.Gilliette
You could specialize std::move to not compile for your right_type. Then, only allow/use rvalue member functions in your right_type (e.g. conversion to string via operator string() const&&). This should disallow any use of lvalue right_type objects and right_type&&.Coincidental
@DyP except std::forward<right_type>(instance) (which you could also block), or if someone reimplements std::forward or std::move from scratch for whatever insane reason.Prevailing
L
13

I started this in a comment but it was a bit big for that. Then, let's make it an answer (even though it doens't really answer your question).

This is a known issue with auto. For instance, it has been discussed by Herb Sutter here and in more details by Motti Lanzkron here.

As they say, there were discussions in the committee to add operator auto to C++ to tackle this problem. The idea would be instead of (or in addition to) providing

operator string() const;

as you mentioned, one would provide

string operator auto() const;

to be used in type deduction contexts. In this case,

auto result = str1 + str2 + ... + strN;

would not deduce the type of result to be the "right type" but rather the type string because that's what operator auto() returns.

AFAICT this is not going to happen in C++14. C++17 pehaps...

Levant answered 22/7, 2013 at 20:57 Comment(5)
Thanks, the links are indeed useful. However, if operator auto will catch the 99% of the usages, by detecting if one is using a lvalue or a rvalue will catch the 100% of the usages (including some crazy doing stringbuilder<stringbuilder<...>,...> foo = a + b;. That's why I put my question this way, and was looking for a design pattern, although at this point I think C++ is never going to provide the necessary language features.Hannis
@Hannis if someone makes an explicit copy of your stringbuilder type, they should be allowed to shoot themselves in the foot, no? :)Prevailing
+1 for some way of forcing auto type deduction, although I'd prefer something like using auto = string; instead of string operator auto(); because this would avoid code duplication (the implementation would be the same for string operator auto(); and operator string();) and the return type in string operator auto(); looks out of place when compared to the other conversion operators.Mossbunker
@mitchnull: I like your idea and I agree with what you said.Levant
I'm accepting this one -- in the end it looks like what I'm asking for is impossible in C++ (C++11 and any foreseeable future revision).Hannis
C
2

Elaborating on a comment I made to the OP; example:

This only tackles the problem of assigning to either an object or binding to a reference and afterwards converting to a destination type. It is not a comprehensive fix the the problem (also see Yakk's response to my comment), but it prevents the scenario presented in the OP and makes it generally harder to write this kind of error-prone code.

Edit: It might not be possible to expand this approach for class templates (more specifically, the specialization of std::move). Macro'ing could work for this specific problem, but is obviously ugly. Overloading std::move would rely on UB.

#include <utility>
#include <cassert>

// your stringbuilder class
struct wup
{
    // only use member functions with rvalue-ref-qualifier
    // this way, no lvalues of this class can be used
    operator int() &&
    {
        return 42;
    }
};

// specialize `std::move` to "prevent" from converting lvalues to rvalue refs
// (make it much harder and more explicit)
namespace std
{
    template<> wup&& move(wup&) noexcept
    {
        assert(false && "Do not use `auto` with this expression!");
    }
    // alternatively: no function body -> linker error
}

int main()
{
    auto obj = wup{};
    auto& lref = obj;
    auto const& clref = wup{};
    auto&& rref = wup{};

    // fail because of conversion operator
      int iObj = obj;
      int iLref = lref;
      int iClref = clref;
      int iRref = rref;
      int iClref_mv = std::move(clref);

    // assert because of move specialization
      int iObj_mv = std::move(obj);
      int iLref_mv = std::move(lref);
      int iRref_mv = std::move(rref);

    // works
    int i = wup{};
}
Coincidental answered 22/7, 2013 at 21:54 Comment(4)
If wup is a template (as most expression template code is), then move(wup) will be a template, so you can stuff a static_assert(false,"not allowed to move") in there, and it will fail to compile when you try to call it... And do the same with std::forward maybe?Prevailing
@Yakk You raised an interesting point: There's no partial specialization for function templates, therefore this cannot work with templates at all!?Coincidental
purely by accident. I forget the rules for injecting things into the std namespace, specialization only, no overrides?Prevailing
@Yakk Yes. I don't know how to solve this for templates; even defining std::move as a friend function wouldn't work.Coincidental
C
0

Just a wild idea (haven't tried it):

template<class T, class U>
class stringbuilder
{
  stringbuilder(stringbuilder const &) = delete;
}

wouldn't force compilation error?

Columbium answered 22/7, 2013 at 21:51 Comment(3)
I don't think this fixes the cases auto const& r = ... and auto&& r = ...Coincidental
But using a reference to a temporary is not a good idea anyway, isn't it?Columbium
Depends on what you mean with "not a good idea". Binding to a const lvalue ref or a rvalue ref extends the lifetime of the temporary, so it doesn't necessarily lead to UB.Coincidental
C
0

A possible approach would be using the null object pattern. While it might make your string builder bigger, it will still avoid the memory allocations.

template <>
class stringbuilder<std::string,std::string> {
   std::string        lhs_value;
   std::string        rhs_value;
   const std::string& lhs;
   const std::string& rhs;

   stringbuilder(const std::string &lhs, const std::string &rhs) 
      : lhs(lhs), rhs(rhs) {}

   stringbuilder(std::string&& lhs, const std::string &rhs) 
      : lhs_value(std::move(lhs)), lhs(lhs_value), rhs(rhs) {}

   stringbuilder(const std::string& lhs, std::string&& rhs)
      : rhs_value(std::move(rhs)), lhs(lhs), rhs(rhs_value) {}

   stringbuilder(std::string&& lhs, std::string&& rhs)
      : lhs_value(std::move(lhs)), rhs_value(std::move(rhs)),
        lhs(lhs_value), rhs(rhs_value) {}
//...

If the argument to the constructor is an lvalue, then you store a reference to the real object. If the argument to the constructor is an rvalue, you can move that into an internal variable with almost no cost (move operations are cheap) and store a reference to that internal object. The rest of the code can access the reference knowing (well, at least hoping) that the string will still be alive.

The hoping part is because there is nothing blocking misuse if an lvalue is passed but the object is destroyed before the stringbuilder completes its job.

Client answered 22/7, 2013 at 21:55 Comment(5)
@DyP: typo in the code... the members are std::string and const std::string& (they could very well be const std::string and const std::string& but I started with non-const in case you want to detect that one of them is an rvalue and start from there (rather than an empty string) in operator std::string(), although I later thought that it would not make much sense overall.Jewelry
Ok, I almost thought so :) This fixes the second problem right_type result = str1 + f(); but not the first right_type result = str1 + str2; str1 = "fie";, right?Coincidental
@DyP: Exactly. I would have to stop and think of a design that would solve also that part of the problem.Jewelry
template<typename LHS, typename RHS> struct stringbuilder { RHS rhs; LHS lhs; stringbuilder(LHS&& lhs, RHS&& rhs):std::forward<LHS>(lhs), std::forward<RHS>(rhs); };, where you carefully set your RHS and LHS types before you pick your constructor, saves you a lot of boilerplate. If LHS&& is an lvalue, then LHS is a value, and if LHS&& is an rvalue, then LHS is a reference!Prevailing
@Yakk: Good idea (the syntax in the comment is a bit off, but I get the idea). Now, this changes the boilerplate in the constructors of stringbuilder but just moves it to operator+ where it probably belongs.Jewelry
A
0

Here is another attempt at solving the issue of dangling references. It doesn't solve the issue of references to things that are modified though.

The idea is to store the temporaries into values, but to have references to lvalues (that we can expect to keep living after the ;).

// Temporary => store a copy
// Otherwise, store a reference
template <typename T>
using URefUnlessTemporary_t
= std::conditional_t<std::is_rvalue_reference<T&&>::value
,                    std::decay_t<T>
,                    T&&>
;

template <typename LHS, typename RHS>
struct StringExpression
{
    StringExpression(StringExpression const&) = delete;
    StringExpression(StringExpression     &&) = default;

    constexpr StringExpression(LHS && lhs_, RHS && rhs_)
        : lhs(std::forward<LHS>(lhs_))
        , rhs(std::forward<RHS>(rhs_))
        { }

    explicit operator std::string() const
    {
        auto const len = size(*this);
        std::string res;
        res.reserve(len);
        append(res, *this);
        return res;
    }

    friend constexpr std::size_t size(StringExpression const& se)
    {
        return size(se.lhs) + size(se.rhs);
    }


    friend void append(std::string & s, StringExpression const& se)
    {
        append(s, se.lhs);
        append(s, se.rhs);
    }

    friend std::ostream & operator<<(std::ostream & os, const StringExpression & se)
    { return os << se.lhs << se.rhs; }

private:
    URefUnlessTemporary_t<LHS> lhs;
    URefUnlessTemporary_t<RHS> rhs;
};

template <typename LHS, typename RHS>
StringExpression<LHS&&,RHS&&> operator+(LHS && lhs, RHS && rhs)
{
    return StringExpression<LHS&&,RHS&&>{std::forward<LHS>(lhs), std::forward<RHS>(rhs) };
}

I've no doubt this could be simplified.

int main ()
{
    constexpr static auto c = exp::concatenator{};
    {
        std::cout << "RVREF\n";
        auto r = c + f() + "toto";
        std::cout << r << "\n";
        std::string s (r);
        std::cout << s << "\n";
    }

    {
        std::cout << "\n\nLVREF\n";
        std::string str="lvref";
        auto r = c + str + "toto";
        std::cout << r << "\n";
        std::string s (r);
        std::cout << s << "\n";
    }

    {
        std::cout << "\n\nCLVREF\n";
        std::string const str="clvref";
        auto r = c + str + "toto";
        std::cout << r << "\n";
        std::string s (r);
        std::cout << s << "\n";
    }
}

NB: I don't provide size(), append() nor concatenator, they aren't the points where the difficulties lie.

PS: I've used C++14 only to simplify the type traits.

Apthorp answered 30/3, 2016 at 13:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.