Assignment via copy-and-swap vs two locks
Asked Answered
S

1

6

Borrowing Howard Hinnant's example and modifying it to use copy-and-swap, is this op= thread-safe?

struct A {
  A() = default;
  A(A const &x);  // Assume implements correct locking and copying.

  A& operator=(A x) {
    std::lock_guard<std::mutex> lock_data (_mut);
    using std::swap;
    swap(_data, x._data);
    return *this;
  }

private:
  mutable std::mutex _mut;
  std::vector<double> _data;
};

I believe this thread-safe (remember op='s parameter is passed by value), and the only problem I can find is the one swept under the rug: the copy ctor. However, it would be a rare class that allows copy-assignment but not copy-construction, so that problem exists equally in both alternatives.

Given that self-assignment is so rare (at least for this example) that I don't mind an extra copy if it happens, consider the potential optimization of this != &rhs to be either negligible or a pessimization. Would there be any other reason to prefer or avoid it compared to the original strategy (below)?

A& operator=(A const &rhs) {
  if (this != &rhs) {
    std::unique_lock<std::mutex> lhs_lock(    _mut, std::defer_lock);
    std::unique_lock<std::mutex> rhs_lock(rhs._mut, std::defer_lock);
    std::lock(lhs_lock, rhs_lock);
    _data = rhs._data;
  }
  return *this;
}

Incidentally, I think this succinctly handles the copy ctor, at least for this class, even if it is a bit obtuse:

A(A const &x) : _data {(std::lock_guard<std::mutex>(x._mut), x._data)} {}
Slightly answered 21/2, 2011 at 22:36 Comment(2)
An answer prompted me to ask this question.Slightly
As the original writer of the linked answer, and having come to the same decision (using pass by value to avoid double locking) I feel that I have to state a crucial difference between the two approaches, which happens to be too long for a comment, so I am editing the answer.Downcomer
U
8

I believe your assignment is thread safe (assuming of course no references outside the class). The performance of it relative to the const A& variant probably depends on A. I think for many A that your rewrite will be just as fast if not faster. The big counter-example I have is std::vector (and classes like it).

std::vector has a capacity that does not participate in its value. And if the lhs has sufficient capacity relative to the rhs, then reusing that capacity, instead of throwing it away to a temp, can be a performance win.

For example:

std::vector<int> v1(5);
std::vector<int> v2(4);
...
v1 = v2;

In the above example, if v1 keeps its capacity to do the assignment, then the assignment can be done with no heap allocation or deallocation. But if vector uses the swap idiom, then it does one allocation and one deallocation.

I note that as far as thread safety goes, both algorithms lock/unlock two locks. Though the swap variant avoids the need to lock both of them at the same time. I believe on average the cost to lock both at the same time is small. But in heavily contested use cases it could become a concern.

Urogenous answered 21/2, 2011 at 23:15 Comment(19)
I'm confused concerning the "just as fast" part, I thought that the whole idea of the copy-and-swap idiom was based on the copy elision of the parameter, wouldn't that fail in this case? I mean, when the copy-ctor is locking, I would be surprised if the compiler could elide the copy, thus always making a true copy of the object we want to copy. And when a true on-heap copy is made, it is likely to be always slower than the initial assignment. Or am I wrong?Hollyanne
The copy can still be elided if the argument to the copy assignment operator is an rvalue. And it is still thread safe! :-) When you have a reference to an rvalue, you are assured that nobody else, not even another thread, has a reference to the same rvalue. Thus it is safe to do whatever you want to that rvalue. The one exception to what I say is if someone casts an lvalue to an rvalue. But in that case, it is the caster's responsibility to ensure that the program can truly treat the casted lvalue as an rvalue, even in a multi-threaded environment.Urogenous
Yes, by "two locks" in the title I meant at the same time (which looks to be a common source of problems, similar to how self-assignment was often overlooked?). I was thinking about rvalue-refs and other such things (even though there's no move constructor above) and completely missed existing capacity. Thanks.Slightly
If it helps, here is an implementation of locking two locks at the same time: llvm.org/svn/llvm-project/libcxx/trunk/include/mutex search for "template <class _L0, class _L1> void lock(_L0& __l0, _L1& __l1) " (I dread how this will be formatted in a comment). This is open source code so feel free to use it, but please keep the copyright with the source wherever you do (it is much more generous than GPL3).Urogenous
@Howard, thanks for the explanation! I realized that the reason of my confusion was actually the existence of shared data (access to which is locked in the copy-ctor), which is not an rvalue even if the object itself is. I forgot that this, however, "does not count" and copies can still be elided even if it could modify the program behavior (this part has always been tricky for me).Hollyanne
@Howard, but now I'm curious - doesn't what you are saying mean that copy-and-swap will always be inefficient when some data members (or members of members etc) are std::vector-likes, regardless of the multi-threading? Which finally means that copy-and-swap shouldn't be used when this pessimisation is potentially possible, i.e. in quite many cases?Hollyanne
@7vies: Your conclusion is partially correct. It is up to the author of each class to determine the best algorithm for copy assignment. Sometimes that will be copy-construct-and-swap and sometimes it won't. The presence of vector-like members may impact this decision, and it may not. It will depend upon all of the members/bases, and may also depend upon the most likely use cases of the copy assignment operator. There's just no getting around performance testing. :-)Urogenous
And it may also depend upon the implementation details of the vector-like implementation you're using. :-( Sorry, no easy answer you can take to the bank. Except maybe: patterns are your tools, not your master.Urogenous
I see.. as I remember copy-and-swap was presented like a silver bullet sometimes, now I'll know that it actually depends. Thanks again!Hollyanne
@7vies: While performance may matter, I usually implement assignment as copy-and-swap because it's simpler. Since it's an implementation detail, I can always come back and modify it if necessary, but in the mean time at least my code is correct.Fredia
@Matthieu: Yep, it's a particular case of the common rule "first KISS then optimize if needed". Still important to know the consequences, though.Hollyanne
@7vies... I guess it is some short of silver bullet, not to be taken as a golden hammer. On the effect of ignoring the capacity, it might have a positive or negative impact, consider that the lhs of the assignment contained a few million elements, but the rhs contained only 2, by not swapping the vector will still keep all the memory even if almost all unused. The best advice, quoting @Howard is: There's just no getting around performance testing If you really care about performance, you have to really work on the specific problem.Downcomer
BTW, something that is missing from this answer is that the semantics of locking both mutexes at once vs. copy-and-swap are not exactly the same, and there is a potential race condition in the copy-and-swap (that you might well not care about), as the rhs can be changed between the time it is copied to the argument and the time the lock on the lhs is acquired. Locking both mutexes ensures that at one point in time both lhs and rhs have the same value, copy-and-swap offers the lesser guarantee that after the assignment the lhs has a value that the rhs had at one particular point in time.Downcomer
If in your application "thread safety" depends only on consistency of each datum, then both are thread safe. One applications for this could be multiple threads performing high throughput operations, one thread gathering statistics: you might not care whether the statistics are an exact snapshot of the stats in all threads, as by the time the user gets the values they will have changed, the result is only an estimate. But it is important that the stats read from each thread are consistent, in particular, in a 32bit box, if the stats is a 64bit int, not locking might cause a 2^32 error!Downcomer
@DavidRodríguez-dribeas: It is a slightly different guarantee, but it does not appear "lesser" is accurate.Slightly
@Fred, I guess that is up to discussion, but one guarantees that the copy and the original will have the same value at some particular time, and the other that the copy will have the value that the original had at some time. The first of which si thus stronger: it guarantees both consistency with a value the source had, and when that consistency happens. The other seems lesser to me as while it ensures that the copy has no junk, it does not at the same time guarantee consistency in a particular instant in time. But again, that is up to discussion. For sure they are different guarantees.Downcomer
@DavidRodríguez-dribeas: Doesn't copy-and-swap also define "when that consistency happens"? Namely, the value when the assignment is "started" (since we're not treating assignment as atomic). In neither case could you ensure, for example: a = b; assert(a == b);Slightly
@Fred Nurk: With the copy and swap idiom, you cannot assert that, with the double lock mechanism you can assert it before releasing the lock. I think I provided a code sample where the behavior of both could be seen to differ, I could look it up, but it might be faster to rewrite it. A thread is performing a = b; in a tight loop, the other thread is running: std::lock( a.m, b.m ); ++a; ++b; assert( equals_non_locking(a,b) ); In the copy-and-swap version, you will eventually hit a case where the copy is performed before the increments, and then a will lag behind: the assert will fail.Downcomer
@DavidRodríguez-dribeas: But the lock is held inside op=, so you cannot ensure a = b; assert(a == b); from any other code. Your example is incomplete and I don't see (once I complete it as appears reasonable) how the CaS version could fail the assert; could you complete it? (In particular, it appears you mean to keep both a.m and b.m locked throughout the increments and assert. Because they are both locked, the other thread cannot do anything with 'a' or 'b' within that region.)Slightly

© 2022 - 2024 — McMap. All rights reserved.