Implementing Move Constructor by Calling Move Assignment Operator
Asked Answered
L

6

31

The MSDN article, How to: Write a Move Constuctor, has the following recommendation.

If you provide both a move constructor and a move assignment operator for your class, you can eliminate redundant code by writing the move constructor to call the move assignment operator. The following example shows a revised version of the move constructor that calls the move assignment operator:

// Move constructor.
MemoryBlock(MemoryBlock&& other)
   : _data(NULL)
   , _length(0)
{
   *this = std::move(other);
}

Is this code inefficient by doubly initializing MemoryBlock's values, or will the compiler be able to optimize away the extra initializations? Should I always write my move constructors by calling the move assignment operator?

Lowering answered 14/6, 2013 at 22:36 Comment(1)
Just a note. That approach does not work if some of your data members are not default constructable.Equalize
F
18

[...] will the compiler be able to optimize away the extra initializations?

In almost all cases: yes.

Should I always write my move constructors by calling the move assignment operator?

Yes, just implement it via move assignment operator, except in the cases where you measured that it leads to suboptimal performance.


Today's optimizer do an incredible job at optimizing code. Your example code is especially easy to optimize. First of all: the move constructor will be inlined in almost all cases. If you implement it via move assignment operator, that one will be inlined as well.

And let's look at some assembly! This shows the exact code from the Microsoft website with both versions of the move constructor: manual and via move assignment. Here is the assembly output for GCC with -O (-O1 has the same output; clang's output leads to the same conclusion):

; ===== manual version =====           |   ; ===== via move-assig =====
MemoryBlock(MemoryBlock&&):            |   MemoryBlock(MemoryBlock&&):
    mov     QWORD PTR [rdi], 0         |       mov     QWORD PTR [rdi], 0
    mov     QWORD PTR [rdi+8], 0       |       mov     QWORD PTR [rdi+8], 0
                                       |       cmp     rdi, rsi
                                       |       je      .L1
    mov     rax, QWORD PTR [rsi+8]     |       mov     rax, QWORD PTR [rsi+8]
    mov     QWORD PTR [rdi+8], rax     |       mov     QWORD PTR [rdi+8], rax
    mov     rax, QWORD PTR [rsi]       |       mov     rax, QWORD PTR [rsi]
    mov     QWORD PTR [rdi], rax       |       mov     QWORD PTR [rdi], rax
    mov     QWORD PTR [rsi+8], 0       |       mov     QWORD PTR [rsi+8], 0
    mov     QWORD PTR [rsi], 0         |       mov     QWORD PTR [rsi], 0
                                       |   .L1:
    ret                                |       rep ret

Apart from the additional branch for the right version, the code is exactly the same. Meaning: duplicate assignments have been removed.

Why the additional branch? The move assignment operator as defined by the Microsoft page does more work than the move constructor: it is protected against self-assignment. The move constructor is not protected against that. But: as I already said, the constructor will be inlined in almost all cases. And in these cases, the optimizer can see that it's not a self assignment, so this branch will be optimized out, too.


This get's repeated a lot, but it's important: don't do premature micro-optimization!

Don't get me wrong, I also hate software that wastes a lot of resources due to lazy or sloppy developers or management decisions. And saving energy is not just about batteries, but also an environmental topic, which I am very passionate about. But, doing micro-optimizations prematurely doesn't help in that regard! Sure, keep the algorithmic complexity and cache friendliness of your large data in the back of your head. But before you do any specific optimization, measure!

In this specific case, I would even guess that you will never have to hand optimize, because the compiler will always be able to generate optimal code around your move constructor. Doing the useless micro-optimization now will cost you development time later when you need to change code in two places or when you need to debug a strange error that only happens because you changed code in only one place. And that is wasted development time that could rather be spent doing useful optimizations.

Fearfully answered 8/12, 2018 at 19:4 Comment(1)
This is simply not feasible for a class MoveConstructible but not MoveAssignable. It makes troubles when the move constructor and the move assignment operator have different exception specifications. Also note it would fail horrifically for allocator-aware containers which have different allocator propagation strategies on different operations. But the root issue is the logic one. That is, in general, move assignment should rely on the move constructor, but not inverse. The necessity of the redundant member initialization here also suggests some non-idiomatic things have gone wrong.Flippant
A
15

I wouldn't do it this way. The reason for the move members to exist in the first place is performance. Doing this for your move constructor is like shelling out megabucks for a super-car and then trying to save money by buying regular gas.

If you want to reduce the amount of code you write, just don't write the move members. Your class will copy just fine in a move context.

If you want your code to be high performance, then tailor your move constructor and move assignment to be as fast as possible. Good move members will be blazingly fast, and you should be estimating their speed by counting loads, stores and branches. If you can write something with 4 load/stores instead of 8, do it! If you can write something with no branches instead of 1, do it!

When you (or your client) put your class into a std::vector, a lot of moves can get generated on your type. Even if your move is lightning fast at 8 loads/stores, if you can make it twice as fast, or even 50% faster with only 4 or 6 loads/stores, imho that is time well spent.

Personally I'm sick of seeing waiting cursors and am willing to donate an extra 5 minutes to writing my code and know that it is as fast as possible.

If you're still not convinced this is worth it, write it both ways and then examine the generated assembly at full optimization. Who knows, your compiler just might be smart enough to optimize away extra loads and stores for you. But by this time you've already invested more time than if you had just written an optimized move constructor in the first place.

Update 11 years later...

I note with some amusement that the referenced MSDN article no longer advises to write the move constructor in terms of the move assignment operator. And yet, it still gets it sort of wrong...

MemoryBlock(MemoryBlock&& other) noexcept
   : _data(nullptr)
   , _length(0)
{
   _data = other._data;
   _length = other._length;
   other._data = nullptr;
   other._length = 0;
}

(I've omitted logging and redundant comments)

Why not just initialize in the initialization list instead of assignments within the constructor body? Like this:

MemoryBlock(MemoryBlock&& other) noexcept
   : _data(other._data)
   , _length(other._length)
{
   other._data = nullptr;
   other._length = 0;
}

I haven't bothered to see if the optimized code is equivalent or not. I would guess that a decent optimizer could make the above two constructors have identical object code. But the second is easier to read and easier to write, with no other downsides.

Aside

The referenced MSDN article has a copy assignment operator that does not meet the basic exception safety guarantee:

MemoryBlock& operator=(const MemoryBlock& other)
{
    if (this != &other)
    {
        // Free the existing resource.
        delete[] _data;

        _length = other._length;
        _data = new int[_length];
        std::copy(other._data, other._data + _length, _data);
    }
    return *this;
}

If _data = new int[_length]; throws an exception, then *this has a _data that points to a deleted block of memory, and a _length with the value other._length. Thus *this is in a state that does not meet the class invariants:

MemoryBlock invariants:

  1. _data is nullptr or points to a new int[_length]-allocated block of memory.
  2. If _data == nullptr then _length == 0.

In addition to this correctness bug, there is also a performance bug:

If _length == other._length, there is no need to go through the very expensive operation of deallocating a block of memory of length _length just to turn around and allocate a block of memory of length _length.

This is a more complicated copy assignment operator, but:

  1. It carries the basic exception safety guarantee. If an exception happens, the object is left in a valid state (class invariants hold).

  2. It avoids expensive trips to the heap at all costs. Branches are cheaper than trips to the heap, except for delete nullptr.

 

MemoryBlock& operator=(const MemoryBlock& other)
{
    if (this != &other)
    {
        if (_length != other._length)
        {
            // Free the existing resource.
            delete[] _data;
            // Set *this to a valid non-owning state
            data_ = nullptr;
            _length = 0;
            if (other._length > 0)
            {
                // Set *this to own a block of length other._length
                _data = new int[other._length];  // might throw
                _length = other._length;
            }
        }
        std::copy(other._data, other._data + _length, _data);
    }
    return *this;
}
Attired answered 15/6, 2013 at 0:23 Comment(9)
There is some good advice here but I think this answer neglects to take into account that the cost of duplicate code is more than just 5 minutes spent writing it a second time-- dup code has much more significant costs than that in readability, maintainability and robustness. Specifically when there is duplicate code, it invites a very common type of bug: that is, when something is fixed or changed in one instance of the code and not in the other. Having been bitten by that more times than I care to in my life, I would not duplicate code unless there is a very strong proven reason to.Jennee
Performance is one of the reasons but not the only or most important one. Most times when I write a move constructor & move assignment operator is because my class has unique ownership of a resource. Often performance is not an issue, so in these cases it makes sense to avoid duplication.Communicable
@user779446: Thanks for sharing your priorities. Here are mine: howardhinnant.github.io/coding_guidelines.htmlAttired
Donald Knuth's wrote: "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."Solvency
What I don't understand about this response is that the call to the move assignment will likely be inlined by the compiler and end up generating the same assembly as if you'd manually written both. In the vast majority of cases... there shouldn't be a performance penalty.Grafton
Well thanks for explaining your downvote. I respect that. Most people downvote silently. In the accepted answer the assembly for both techniques is helpfully displayed. Both are fully inlined and require 8 QWORD moves. That's pretty fast by any measure. But the one writes the move constructor in terms of the assignment includes a conditional branch. Branches are infamous at creating pipeline stalls. Admittedly branch prediction algorithms have gotten pretty good at reducing the cost of pipeline stalls. But when you hit one, the cost can be overwhelming compared to 8 moves.Attired
And then this is just for MemoryBlock. Be careful how this one example is used to generalize all code. It is easy to convince programmers that it is ok to always write code this way, without bothering to measure/test. If it were really universally ok, the = default version of the move constructor would be written to do this. But it's not.Attired
The resulted strategy is reasonable, but the reasons are insufficient. The root is far more subtle: even though C++'s as-if rules allows such optimization, it can't make this come true in general because the implementation has no way to reason about which copies are precisely redundant without additional information provided by users (even with current C++). Thus it is not a QoI issue of the language implementation (although it can be for the language specification). That's why we need move explicitly in the language.Flippant
Also note that the move constructor/operator (in the standard library, at least) should not be randomly specified. In the other word, the decision to support explicit move or not is about the design of the API. The Knuth's argument also can't defeat this concern.Flippant
B
4

My C++11 version of the MemoryBlock class.

#include <algorithm>
#include <vector>
// #include <stdio.h>

class MemoryBlock
{
 public:
  explicit MemoryBlock(size_t length)
    : length_(length),
      data_(new int[length])
  {
    // printf("allocating %zd\n", length);
  }

  ~MemoryBlock() noexcept
  {
    delete[] data_;
  }

  // copy constructor
  MemoryBlock(const MemoryBlock& rhs)
    : MemoryBlock(rhs.length_) // delegating to another ctor
  {
    std::copy(rhs.data_, rhs.data_ + length_, data_);
  }

  // move constructor
  MemoryBlock(MemoryBlock&& rhs) noexcept
    : length_(rhs.length_),
      data_(rhs.data_)
  {
    rhs.length_ = 0;
    rhs.data_ = nullptr;
  }

  // unifying assignment operator.
  // move assignment is not needed.
  MemoryBlock& operator=(MemoryBlock rhs) // yes, pass-by-value
  {
    swap(rhs);
    return *this;
  }

  size_t Length() const
  {
    return length_;
  }

  void swap(MemoryBlock& rhs)
  {
    std::swap(length_, rhs.length_);
    std::swap(data_, rhs.data_);
  }

 private:
  size_t length_;  // note, the prefix underscore is reserved.
  int*   data_;
};

int main()
{
   std::vector<MemoryBlock> v;
   // v.reserve(10);
   v.push_back(MemoryBlock(25));
   v.push_back(MemoryBlock(75));

   v.insert(v.begin() + 1, MemoryBlock(50));
}

With a correct C++11 compiler, MemoryBlock::MemoryBlock(size_t) should only be called 3 times in the test program.

Brooklime answered 10/10, 2013 at 21:43 Comment(4)
You should use std::unique_ptr<int[]> instead of manually deleting your array ; that would make it more c++11ishHun
I like swap() in the move assignment operator because this ensures that the destructor will be properly executed on the object's instance, thus releasing all its resources.Arv
Unifying assignment is acceptable when the data members do not occur overhead of a redundant copy on move. The overhead always exists here as per this analysis. Although such overhead is arguably small enough to ignore in this case, a suboptimal implementation is difficult to be idiomatic. Since the overhead may be not neglectable in some other cases, it is also difficult to get the criteria right (and portable). This can be a big issue in both API and ABI compatbility. Thus, I suggest to avoid unifying in special member functions at all.Flippant
In your case, the programmer still has to write two mostly similiar methods: swap() and move constructor (which is basically a specialization of swap() with empty/uninitialized lhs object). So, if a swap() method is required, anyways, your version is nice. If it is not needed, it doesn't avoid code duplication! Please note, that some recommendations go even further to use swap() also for the move constructor, by first creating an empty object via the default constructor and then "swapping in" the moved in object.Perspire
L
1

I don't think you're going to notice significant performance difference. I consider it good practice to use the move assignment operator from the move constructor.

However I would rather use std::forward instead of std::move because it's more logical:

*this = std::forward<MemoryBlock>(other);
Lazor answered 14/6, 2013 at 22:42 Comment(4)
std::forward is meant for use in templates, when you don't know whether you can move the object or not.Bannock
Isn't the "other" variable already an rvalue, so there is no need for std::move?Potaufeu
@MikeFisher Nope, a named rvalue is an lvalueHun
This worked well for me, and resolved a clang-tidy warning about using something after moving it. Thanks!Nonjuror
O
0

It depends what your move assignment operator does. If you look at the one in the article you linked to, you see in part:

  // Free the existing resource.
  delete[] _data;

So in this context, if you called the move assignment operator from the move constructor without initialising _data first, you would end up trying to delete an uninitialized pointer. So in this example, inefficient or not, it's actually crucial that you do initialize the values.

Opaque answered 15/6, 2013 at 7:25 Comment(2)
You wouldn't need to initialize the value twice if you used the initializer list :_data(other._data)Lowering
No but then you're re-implementing the move code, rather than calling the assignment operator which was what your question was about.Opaque
P
-2

I would simply eliminate member initialization and write,

MemoryBlock(MemoryBlock&& other)
{
   *this = std::move(other);
}

This will always work unless the move assignment throw exceptions, and it typically doesn't!

Advantages of this styles:

  1. You don't need to worry about whether the compiler would doubly initialize the members, because that may vary in different environments.
  2. You write less code.
  3. You don't need to update it even if you add extra members to the class in the future.
  4. Compilers can often inline the move assignment, so the overhead of copy constructor would be minimal.

I think @Howard's post does not quite answer this question. In practice, classes often don't like copying, a lot of classes simply disable copy constructor and copy assignment. But most classes can be moveable even if they are not copyable.

Plast answered 22/9, 2016 at 14:31 Comment(1)
Move-assignment operator may release this object's resources before stealing resources from the source object. If you don't initialize members in move constructor, then they will contain some garbage values and move-assignment operator will try to release them.Phylis

© 2022 - 2024 — McMap. All rights reserved.