Why do some people use swap for move assignments?
Asked Answered
C

4

77

For example, stdlibc++ has the following:

unique_lock& operator=(unique_lock&& __u)
{
    if(_M_owns)
        unlock();
    unique_lock(std::move(__u)).swap(*this);
    __u._M_device = 0;
    __u._M_owns = false;
    return *this;
}

Why not just assign the two __u members to *this directly? Doesn't the swap imply that __u is assigned the *this members, only to later have then assigned 0 and false... in which case the swap is doing unnecessary work. What am I missing? (the unique_lock::swap just does an std::swap on each member)

Cashmere answered 14/7, 2011 at 1:3 Comment(0)
H
137

It's my fault. (half-kidding, half-not).

When I first showed example implementations of move assignment operators, I just used swap. Then some smart guy (I can't remember who) pointed out to me that the side effects of destructing the lhs prior to the assignment might be important (such as the unlock() in your example). So I stopped using swap for move assignment. But the history of using swap is still there and lingers on.

There's no reason to use swap in this example. It is less efficient than what you suggest. Indeed, in libc++, I do exactly what you suggest:

unique_lock& operator=(unique_lock&& __u)
    {
        if (__owns_)
            __m_->unlock();
        __m_ = __u.__m_;
        __owns_ = __u.__owns_;
        __u.__m_ = nullptr;
        __u.__owns_ = false;
        return *this;
    }

In general a move assignment operator should:

  1. Destroy visible resources (though maybe save implementation detail resources).
  2. Move assign all bases and members.
  3. If the move assignment of bases and members didn't make the rhs resource-less, then make it so.

Like so:

unique_lock& operator=(unique_lock&& __u)
    {
        // 1. Destroy visible resources
        if (__owns_)
            __m_->unlock();
        // 2. Move assign all bases and members.
        __m_ = __u.__m_;
        __owns_ = __u.__owns_;
        // 3. If the move assignment of bases and members didn't,
        //           make the rhs resource-less, then make it so.
        __u.__m_ = nullptr;
        __u.__owns_ = false;
        return *this;
    }

Update

In comments there's a followup question about how to handle move constructors. I started to answer there (in comments), but formatting and length constraints make it difficult to create a clear response. Thus I'm putting my response here.

The question is: What's the best pattern for creating a move constructor? Delegate to the default constructor and then swap? This has the advantage of reducing code duplication.

My response is: I think the most important take-away is that programmers should be leery of following patterns without thought. There may be some classes where implementing a move constructor as default+swap is exactly the right answer. The class may be big and complicated. The A(A&&) = default; may do the wrong thing. I think it is important to consider all of your choices for each class.

Let's take a look at the OP's example in detail: std::unique_lock(unique_lock&&).

Observations:

A. This class is fairly simple. It has two data members:

mutex_type* __m_;
bool __owns_;

B. This class is in a general purpose library, to be used by an unknown number of clients. In such a situation, performance concerns are a high priority. We don't know if our clients are going to be using this class in performance critical code or not. So we have to assume they are.

C. The move constructor for this class is going to consist of a small number of loads and stores, no matter what. So a good way to look at the performance is to count loads and stores. For example if you do something with 4 stores, and somebody else does the same thing with only 2 stores, both of your implementations are very fast. But their's is twice as fast as yours! That difference could be critical in some client's tight loop.

First lets count loads and stores in the default constructor, and in the member swap function:

// 2 stores
unique_lock()
    : __m_(nullptr),
      __owns_(false)
{
}

// 4 stores, 4 loads
void swap(unique_lock& __u)
{
    std::swap(__m_, __u.__m_);
    std::swap(__owns_, __u.__owns_);
}

Now lets implement the move constructor two ways:

// 4 stores, 2 loads
unique_lock(unique_lock&& __u)
    : __m_(__u.__m_),
      __owns_(__u.__owns_)
{
    __u.__m_ = nullptr;
    __u.__owns_ = false;
}

// 6 stores, 4 loads
unique_lock(unique_lock&& __u)
    : unique_lock()
{
    swap(__u);
}

The first way looks much more complicated than the second. And the source code is larger, and somewhat duplicating code we might have already written elsewhere (say in the move assignment operator). That means there's more chances for bugs.

The second way is simpler and reuses code we've already written. Thus less chance of bugs.

The first way is faster. If the cost of loads and stores is approximately the same, perhaps 66% faster!

This is a classic engineering tradeoff. There is no free lunch. And engineers are never relieved of the burden of having to make decisions about tradeoffs. The minute one does, planes start falling out of the air and nuclear plants start melting down.

For libc++, I chose the faster solution. My rationale is that for this class, I better get it right no matter what; the class is simple enough that my chances of getting it right are high; and my clients are going to value performance. I might well come to another conclusion for a different class in a different context.

Halliday answered 14/7, 2011 at 1:29 Comment(11)
Thanks for the informative reply. In some cases I have seen the individual member assignments use a form of lhs = move(rhs), and while it's not an issue for these built-in types, is the move() useful in any case, or can one assume that the compiler would always use the member type's move assignment? Also, can you address the exception safety that Kerrek SB mentioned below? Thanks again.Cashmere
The move assignment must be explicitly used in general. I didn't use it here basically out of laziness. It could have been used and the code generation should be the same. And in generic code, or the case where the members/bases aren't scalars, it is absolutely necessary. Exception safety: The best designed move assignment operators are noexcept. A move assignment operator doesn't have to create resources, only transfer them, so noexcept is a realistic goal. The same can be said for swap. Of course there are exceptions to the rule (the world is never perfect or neat).Halliday
And I'd like to add: both swap and move assignment are type primitives. They do slightly different things and should be as efficient as possible so that your clients can build upon them without paying a performance penalty. If you can build one out of the other without a performance penalty that's great. If you can't, but still insist on building one out of the other, that's just laziness and your clients pay the price. Make your objects as high quality as you can. And let your clients judge what is high quality and what is not.Halliday
Ok, better to implement Move-Assign T::operator=(T&&) without swap. Thanks for the good explanation. But What about the Move T::T(T&&)?Colossians
@towi: For move construction I generally don't use swap. One would have to set *this to some valid state and then swap it to rhs. One could just as easily set *this to the state of rhs and then set rhs to some valid state. This is often accomplished by simply move constructing each base and member, and then fixing up the rhs.Halliday
@Howard: The only worry I have is, that it introduces code duplication for constructing the members one-by-one. I'd have thought towards a delegating constructor to set *this to a default state and then use one single swap(*this,rhs). Listing all members is an "easy brakable part" when someone adds a member. That's what I was worrying about. So, I guess I was wondering if this should be centralized in one swap implementation that can handle &&-swapping, and then used for Move and Move-Assign alike.Colossians
@Howard: Thank you Howard. You are right of course: Idiomatic first, performance when it matters -- and here it matters, in the STL every tick counts. I will also take your examples with the two alternatives into my heart. If you would not have so many points already, I would worry about not being able to grant you additional ones for the merged answer ;-)Colossians
The move assignments here and in the query, at the time of this writing, are self-assignment-suicidal. You may want to proof against that, unless you like clients swearing at you after a 5-hour debugging session for why a loop only occasionally crashes. (Aliasing could make both sides of a move-assignment refer to the same object. No worries if at least one side is a truly temporary object, though.)Logistician
Why destroy visible resources? Doesn't that make move-assignment more costly, and lowers safe-move-assignment from harmless to spoiling? Also, are current optimizing compilers really / still incapable of optimizing your two ways to write unique_lock(unique_lock&&) to the same optimal result?Nobleminded
When resources are not just memory, but something like mutexes or streams, the semantics are wrong with swap: This is LWG issue 675: cplusplus.github.io/LWG/lwg-defects.html#675Halliday
I'm not convinced the semantics actually are wrong. Also, it seems to penalize the common case (don't care whether the moved-to objects resources survive until the moved-from object is destroyed or re-used) for the rare case (It actually makes a semantic difference, and I wanted that destroyed). I'll have to Keep pondering it a bit. Anyway, it made it into C++17, so it's a done deal.Nobleminded
C
10

It's about exception safety. Since __u is already constructed when the operator is called, we know there's no exception, and swap doesn't throw.

If you did the member assignments manually, you'd risk that each of those might throw an exception, and then you'd have to deal with having partially move-assigned something but having to bail out.

Maybe in this trivial example this doesn't show, but it's a general design principle:

  • Copy-assign by copy-construct and swap.
  • Move-assign by move-construct and swap.
  • Write + in terms of construct and +=, etc.

Basically, you try to minimize the amount of "real" code and try to express as many other features in terms of the core features as you can.

(The unique_ptr takes an explicit rvalue reference in the assignment because it does not permit copy construction/assignment, so it's not the best example of this design principle.)

Cruciform answered 14/7, 2011 at 1:6 Comment(0)
M
2

Another thing to consider regarding the trade-off:

The default-construct + swap implementation may appear slower, but -sometimes- data flow analysis in the compiler can eliminate some pointless assignments and end up very similar to handwritten code. This works only for types without "clever" value semantics. As an example,

 struct Dummy
 {
     Dummy(): x(0), y(0) {} // suppose we require default 0 on these
     Dummy(Dummy&& other): x(0), y(0)
     {
         swap(other);             
     }

     void swap(Dummy& other)
     {
         std::swap(x, other.x);
         std::swap(y, other.y);
         text.swap(other.text);
     }

     int x, y;
     std::string text;
 }

generated code in move ctor without optimization:

 <inline std::string() default ctor>
 x = 0;
 y = 0;
 temp = x;
 x = other.x;
 other.x = temp;
 temp = y;
 y = other.y;
 other.y = temp;
 <inline impl of text.swap(other.text)>

This looks awful, but data flow analysis can determine it is equivalent to the code:

 x = other.x;
 other.x = 0;
 y = other.y;
 other.y = 0;
 <overwrite this->text with other.text, set other.text to default>

Maybe in practice compilers won't always produce the optimal version. Might want to experiment with it and take a glance at the assembly.

There are also cases when swapping is better than assigning because of "clever" value semantics, for example if one of the members in the class is a std::shared_ptr. No reason a move constructor should mess with the atomic refcounter.

Myrmecophagous answered 3/3, 2013 at 22:15 Comment(1)
Yes, the compiler may be smart enough to optimize this in some cases. But if you're aiming for performance and portability both I suggest you don't rely on it. Furthermore, the shared_ptr example you give isn't a good one as it already implements move construction and move assignment: just use those operations.Cortege
L
2

I will answer the question from header: "Why do some people use swap for move assignments?".

The primary reason to use swap is providing noexcept move assignment.

From Howard Hinnant's comment:

In general a move assignment operator should:

  1. Destroy visible resources (though maybe save implementation detail resources).

But in general destroy/release function can fail and throw exception!

Here is an example:

class unix_fd
{
    int fd;

public:
    explicit unix_fd(int f = -1) : fd(f) {}

    ~unix_fd()
    {
        if(fd == -1) return;
        if(::close(fd)) /* !!! call is failed! But we can't throw from destructor so just silently ignore....*/;
    }

    void close() // Our release-function
    {
        if(::close(fd)) throw system_error_with_errno_code;
    }
};

Now let's compare two implementaions of move-assignment:

// #1
unix_fd &unix_fd::operator=(unix_fd &&o) // Can't be noexcept
{
    if(&o != this)
    {
        close(); // !!! Can throw here
        fd = o.fd;
        o.fd = -1;
    }
    return *this;
}

and

// #2
unix_fd &unix_fd::operator=(unix_fd &&o) noexcept
{
    std::swap(fd, o.fd);
    return *this;
}

#2 is perfectly noexcept!

Yes, close() call can be "delayed" in case #2. But! If we want strict error checking we must use explicit close() call, not destructor. Destructor releases resource only in "emergency" situations, where exeption can't be thrown anyway.

P.S. See also discussion here in comments

Lessard answered 24/11, 2015 at 12:39 Comment(4)
Your example is an example of how using swap can lead to confusing problems. After a move-assignment one would expect the old value to be gone and the file closed after the move, but with this it will stay open until the end of the original block.Commination
You mustn't expect that! The Standartd clearly states that after move the value is in some valid but unspecified state. Instead call close() explicitly beforehandLessard
It's like expecting an operator+ to add things together. It may not be in the Standard, but it is expected and polite behavior to destroy the thing you assign to.Commination
"The Standartd clearly states that after move the value is in some valid but unspecified state" ... But that holds only for standard library types. one should not expect user-defined types to follow the same rules as it is very arguable whether the standard library design is the best one for any given case.Troy

© 2022 - 2024 — McMap. All rights reserved.