Why are move semantics necessary to elide temporary copies?

Asked 6/12, 2016 at 3:46 Answered 6/12, 2016 at 4:25

Solved c++ c++11 move-semantics rvalue-reference copy-elision

So my understanding of move semantics is that they allow you to overload functions for use with temporary values (rvalues) and avoid potentially expensive copies (by moving the state from an unnamed temporary into your named lvalue).

My question is why do we need special semantics for this? Why couldn't a C++98 compiler elide these copies, since it's the compiler that determines whether a given expression is an lvalue or an rvalue? As an example:

void func(const std::string& s) {
    // Do something with s
}

int main() {
    func(std::string("abc") + std::string("def"));
}

Even without C++11's move semantics, the compiler should still be able to determine that the expression passed to func() is an rvalue, and thus the copy from a temporary object is unnecessary. So why have the distinction at all? It seems like this application of move semantics is essentially a variant of copy elision or other similar compiler optimizations.

As another example, why bother having code like the following?

void func(const std::string& s) {
    // Do something with lvalue string
}

void func(std::string&& s) {
    // Do something with rvalue string
}

int main() {
    std::string s("abc");

    // Presumably calls func(const std::string&) overload
    func(s);

    // Presumably calls func(std::string&&) overload
    func(std::string("abc") + std::string("def"));
}

It seems like the const std::string& overload could handle both cases: lvalues as usual, and rvalues as a const reference (since temporary expressions are sort of const by definition). Since the compiler knows when an expression is an lvalue or an rvalue, it could decide whether to elide the copy in the case of an rvalue.

Basically, why are move semantics considered special and not just a compiler optimization that could have been performed by pre-C++11 compilers?

Tepic asked 6/12, 2016 at 3:46 Comment(2)

Could you explain how this is significantly different from my second example above? – Tepic 6/12, 2016 at 3:56

Imagine you're the compiler generating the code for func ... and there may be calls to it in other translation units... do you move out of the parameter or not? – Hardpressed 6/12, 2016 at 3:57

Move semantics do not exactly elide temporary copies.

The same number of temporaries exists. It's just that, instead of the copy constructor, the move constructor is typically called, and it is allowed to cannibalize the original rather than make an independent copy. This may sometimes be vastly more efficient.

The C++ formal object model is not at all modified by move semantics. Objects still have a well-defined lifetime, starting at some particular address, and ending when they are destroyed there. They never "move" during their lifetime. When they are "moved from", what is really happening is that the guts are scooped out of an object that is scheduled to die soon, and placed efficiently in a new object. It may look like they moved, but formally, they didn't really, as that would totally break C++.

Being moved from is not death. Move is required to leave objects in a "valid state" in which they are still alive, and the destructor will always be called later.

Eliding copies is a totally different thing, where in some chain of temporary objects, some of the intermediates are skipped. Compilers are not required to elide copies in C++11 and C++14, but they are permitted to do so even when it violates the "as-if" rule that usually guides optimization. That is, even if the copy ctor has side effects, the compiler may still skip some of the temporaries.

By contrast, "guaranteed copy elision" is a new C++17 feature, which means that the standard requires copy elision to take place in certain cases.

Move semantics and copy elision give two different approaches to enabling greater efficiency in these "chain of temporaries" scenarios. In move semantics, all the temporaries still exist, but instead of calling the copy constructor, we get to call a (hopefully) less expensive constructor, the move constructor. In copy elision, we get to skip some of the objects altogether.

Basically, why are move semantics considered special and not just a compiler optimization that could have been performed by pre-C++11 compilers?

Move semantics are not a "compiler optimization". They are a new part of the type system. Move semantics apply even when you compile with -O0 on gcc and clang -- they cause different functions to be called, because the fact that an object is about to die is now "annotated" in the type of the reference. This allows "application level optimizations", but that is different from what the optimizer does.

Maybe you can think of it as a safety-net. Sure, in an ideal world the optimizer would always eliminate every unnecessary copy. Sometimes, though, constructing a temporary is complex, involves dynamic allocations, and the compiler doesn't see through it all. In many such cases, you will be saved by move semantics, which might allow you to avoid making a dynamic allocation at all. That in turn may lead to generated code that is then easier for the optimizer to analyze.

The guaranteed copy elision thing is sort of like, they found a way to formalize some of this "common sense" about temporaries, so that more code not only works the way you expect when it gets optimized, but is required to work the way you expect when it gets compiled, and not call a copy constructor when you think there shouldn't really be a copy. So you can e.g. return non-copyable, non-movable types by value from a factory function. The compiler figures out that no copy happens much earlier in the process, before it even gets to the optimizer. This is really the next iteration of this series of improvements.

Buchalter answered 6/12, 2016 at 3:57 Comment(2)

Ok, I think I understand now. Move semantics in this case are a "more efficient copy" that nevertheless takes place on a temporarily-constructed object. Would it be possible for you to provide examples of situations where the compiler would be unable to elide such temporaries? I would think in my example that it should be possible, but I don't know how to test for sure. – Tepic 6/12, 2016 at 12:50

Out of this answer what I like most: "Sure, in an ideal world the optimizer would always eliminate every unnecessary copy. Sometimes, though, constructing a temporary is complex, involves dynamic allocations, and the compiler doesn't see through it all." Though some examples would be nice which show when it's almost impossible for the compiler to avoid copies which move semantics can avoid. – Deipnosophist 28/2, 2018 at 8:48

Copy elision and move semantics are not exactly the same. With copy elision, the entire object is not copied, it stays in place. With a move, "something" still gets copied. The copy is not really eliminated. But that "something" is a pale shadow of what a full-blown copy has to haul.

A simple example:

#include <vector>

class Bar {
    std::vector<int> foo;

public:
    Bar(const std::vector<int> &bar) : foo(bar)
    {
    }
};

std::vector<int> foo();

int main()
{
    Bar bar = foo();
}

Good luck trying to get your compiler to eliminate the copy, here.

Now, add this constructor:

    Bar(std::vector<int> &&bar) : foo(std::move(bar))
    {
    }

And now, the object in main() gets constructed using a move operation. The copy step has not actually been elided, but the move operation that replaces it is just some line noise.

On the other hand:

Bar foo();

int main()
{
     Bar bar=foo();
}

That's going to get a full copy-elision here. Nothing gets copied.

In conclusion: move semantics does not actually elide or eliminate a copy. It just makes the resulting copy "less".

Deane answered 6/12, 2016 at 3:59 Comment(1)

If the Bar class has a constructor taking const std::vector& then the compiler could probably generate a constructor taking an rvalue vector automatically for the case when a temporary is passed. This is probably complicated for the compiler to do and not according to current C++ standards, but should be possible. – Deipnosophist 28/2, 2018 at 8:41

You have a fundamental misunderstanding of how certain things in C++ work:

Even without C++11's move semantics, the compiler should still be able to determine that the expression passed to func() is an rvalue, and thus the copy from a temporary object is unnecessary.

That code does not provoke any copying, even in C++98. A const& is a reference, not a value. And because it's const, it is perfectly capable of referencing a temporary. As such, a function taking a const string& never gets a copy of the parameter.

That code will create a temporary and pass a reference to that temporary to func. No copying happens, at all.

As another example, why bother having code like the following?

Nobody does. A function should only take a parameter by rvalue-reference if that function will move from it. If a function is only going to observe the value without modifying it, it takes it by const&, just like in C++98.

Most important of all:

So my understanding of move semantics is that they allow you to overload functions for use with temporary values (rvalues) and avoid potentially expensive copies (by moving the state from an unnamed temporary into your named lvalue).

Your understanding is wrong.

Moving is not solely about temporary values; if it were, we wouldn't have std::move, which allows us to move from lvalues. Moving is about transferring ownership of data from one object to another. While that frequently does happen with temporaries, it can also happen with lvalues:

std::unique_ptr<T> p = ...;
std::unique_ptr<T> other_p = std::move(p);
assert(p == nullptr); //Will always be true.

This code creates a unique_ptr, then moves the contents of that pointer into another unique_ptr object. It is not dealing with temporaries; it is transferring ownership of the internal pointer to another object.

This is not something a compiler could deduce that you wanted to do. You have to be explicit that you want to perform such a move on an lvalue (which is why std::move is there).

Marmoset answered 6/12, 2016 at 4:25 Comment(1)

About your last example: theoretically the compiler could see that p is never used again after being assigned to other_p, so it could elide the copy and just move/reassign the pointers. I think that's the author's point: he thinks that almost all (if not all) features of move semantics could be done by a smart enough compiler, although it might not be practically feasible in some cases. – Deipnosophist 28/2, 2018 at 8:45

The answer is that move semantics were not introduced to eliminate copies. They were introduced to allow and promote cheaper copying. For example, if all data members of a class are simple integers, copying and moving do exactly the same work. In this case it does not make sense to define a move ctor and move assignment operator for the class. A move ctor and move assignment make sense when the class has something that can be moved.

There are tons of articles on this subject. Nevertheless some notes:

Once a parameter is specified as T&& it is clear to everybody that it is ok to steal its contents. Period. Simple and clear. In C++03 there was no clear syntax or any other established convention to convey this idea. In fact, there are tons of other ways of expressing the same thing. But the committee chose this way.
Move semantics are useful not only with temporaries (rvalues). They can be used anywhere you want to indicate that you are passing your object to a function and that the function may take its contents.

You may have this code:

void Func(std::vector<MyComplexType> &v)
{
    MyComplexType x;
    x.Set1();          // Expensive function that allocates storage
                       // and computes something.
    // ...             // Ton of other code with if statements and loops
                       // that builds the object x.

    v.push_back(std::move(x));  // (1)

    x.Set2();         // Code continues to use x. This is ok.        
}

Note that in line (1) the move ctor will be used and the object will be added at a cheaper price. Note also that the object is not dying at this line and there are no temporaries involved.

Garton answered 6/12, 2016 at 3:58 Comment(2)

Nice example which shows you can still use an object after its contents have been stolen (moved). Not sure though if I've seen it often in real life situations. – Deipnosophist 28/2, 2018 at 8:53

The standard is explicit that after a move ctor/assignment the object should be in a valid state, so that it can be safely destructed. – Garton 28/2, 2018 at 18:51
