unlock the mutex after condition_variable::notify_all() or before?
Looking at several videos and the documentation example, we unlock the mutex before calling notify_all(). Would it be better to unlock it after the call instead?

The common way:

Inside the Notifier thread:

//prepare data for several worker-threads;

//and now, awaken the threads:
std::unique_lock<std::mutex> lock2(sharedMutex);
_threadsCanAwaken = true;

lock2.unlock(); 
_conditionVar.notify_all(); //awaken all the worker threads;

//wait until all threads completed;

//cleanup:
_threadsCanAwaken = false;

//prepare new batches once again, etc, etc

Inside one of the worker threads:

while(true){
    // wait for the next batch:

    std::unique_lock<std::mutex> lock1(sharedMutex);
    _conditionVar.wait(lock1, [](){ return _threadsCanAwaken; });
    lock1.unlock(); //let sibling worker-threads work on their part as well

    //perform the final task

    //signal the notifier that one more thread has completed;

    //loop back and wait until the next task
}

Notice how lock2 is unlocked before we notify the condition variable. Should we instead unlock it after the notify_all()?

Edit

From my comment below: my concern is that the worker might wake spuriously, see that the mutex is unlocked, complete the task very quickly, and loop back to the start of the while. The slow-poke notifier then finally calls notify_all(), causing the worker to loop an additional time (excessive and undesired).

Lux answered 25/9, 2018 at 17:1 Comment(2)
The destructor of lock2 will automatically unlock the mutex after the _conditionVar.notify_all();, so you don't need to call unlock explicitly at all; that's the common idiom, IIRC. BTW, don't use a leading underscore for any of your names; those are reserved for the compiler and standard library implementations.Peres
Thanks! My concern is that the consumer might be super quick: the consumer spuriously wakes, sees that the mutex is unlocked, completes the task, and loops back to the start of the while. Now the slow-poke producer finally calls notify_all(), causing the consumer to loop an additional time.Lux

There are no advantages to unlocking the mutex before signaling the condition variable unless your implementation is unusual. There are two disadvantages to unlocking before signaling:

  1. If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.

  2. There is a performance penalty for unlocking before signaling that is avoided by unlocking after signaling. If you signal before you unlock, a good implementation will know that your signal cannot possibly render any thread ready-to-run, because the mutex is held by the calling thread and any thread affected by the condition variable necessarily cannot make forward progress without the mutex. This permits a significant optimization (often called "wait morphing") that is not possible if you unlock first.

So signal while holding the lock unless you have some unusual reason to do otherwise.
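
For illustration, here is a minimal sketch of the ordering this answer recommends, assuming globals equivalent to the question's sharedMutex, _conditionVar and _threadsCanAwaken (renamed here without the reserved leading underscore); it is not code from the answer itself:

#include <condition_variable>
#include <mutex>

std::mutex sharedMutex;
std::condition_variable conditionVar;
bool threadsCanAwaken = false;

void awakenWorkers() {
    // Change the shared state and notify while still holding the mutex;
    // the lock_guard releases sharedMutex only after notify_all() returns,
    // so no waiter can slip in between the state change and the signal.
    std::lock_guard<std::mutex> lock(sharedMutex);
    threadsCanAwaken = true;
    conditionVar.notify_all();
}

A waiter woken by this notify_all() cannot run until the lock is released at the end of awakenWorkers(), which is exactly the situation wait morphing is meant to optimize.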

Armored answered 11/2, 2021 at 20:45 Comment(12)
This is a bit at odds with the common advice that signaling before unlocking is a pessimization. Could you elaborate and counter this claim in particular? Point 1 that you mention seems like a good safety measure, but I don't understand the performance penalty you mention in point 2. I can imagine that signal -> lock can be optimized, but what exactly is the penalty that occurs from unlocking first?Meiny
If you unlock first, the unlock can make a thread ready-to-run and the signal can make a thread ready-to-run. If you unlock after signalling, the signal cannot possibly make any thread ready-to-run because any thread blocked on the signal would still need to acquire the mutex to make forward progress and the calling thread has the mutex. This makes the signal operation cheaper because a significant part of what it would otherwise have to do can be skipped. Essentially, the penalty is that the wait morphing optimization cannot take place.Armored
I don't see how the deadlock you mention in 1. is avoided if you signal before unlocking. The scheduler could always decide to start another thread waiting for the same lock, even if it's not waiting on the condition variable. If your program is subject to 1., then you just have a bug, independently of whether you signal then unlock or the other way around.Catabolite
@GiovanniMascellani If you signal before unlocking the mutex, then the signal can only wake a thread that chose to block before it saw the change the signaling thread made to the shared state. If you unlock the mutex first, you may wake a thread that just chose to block on the condition variable after seeing your change, a thread that will definitely just go back to sleep.Armored
@Meiny I think you are right. The unlock-notify sequence does not require any special optimizations, while the notify-unlock sequence requires wait morphing to be efficient. And wait morphing currently does not exist on Linux. See my answer for more info.Revell
@DavidSchwartz After digesting your 2022-11-02 comment, I understood what deadlock you're talking about. Although specifying a slightly different API, POSIX correctly states that the wait call should be wrapped in a loop that checks for the condition, and wait again only if the condition is not satisfied (unless, for example, an error occurred, which itself may be considered part of the condition in some cases). The same recommendations are made elsewhere for resolving such so-called deadlock (I find "sleep paralysis" a better term for condvars).Appellant
@Appellant That doesn't fix the deadlock. You can still have a case where a thread is blocked on a condition variable, the state changes, another thread blocks on the condition variable, then the condition variable is signaled and the second thread wakes. The second thread just goes back to sleep and the first thread never gets woken up. That's the deadlock I'm talking about. It can only occur if you use the same condition variable with more than one predicate (for example, "queue full" and "queue empty") and if you signal rather than broadcast.Armored
@DavidSchwartz Yes, that's also the deadlock I'm referring to as sleep paralysis. 1 condvar must be associated with exactly 1 condition (it may be complex, but exactly 1 condition), and thus with exactly 1 mutex. What I'm arguing is that signalling before or after unlocking doesn't matter; it's testing the condition before calling wait that solves it. The definitions of condvar wait and signal and mutex unlock don't conflict with each other, and it's perfectly valid to signal 2 different condvars associated with 1 mutex (David Butenhof refers to these as "big" mutexes in his book on Pthreads).Appellant
@Appellant It is perfectly safe to associate one condition variable with more than one condition (for example, "queue full" and "queue empty") so long as you either broadcast the condition variable rather than signalling it or signal while holding the mutex. Only if you do all three things can you deadlock: 1) One c.v. for more than one condition. 2) You signal the c.v. and not broadcast. 3) You signal after releasing the mutex. You just can't do all three of those things.Armored
@DavidSchwartz Sure. But for 1), the "queue full" and "queue empty" conditions can be combined into "queue uncontinuable state change": the reader cannot continue when the queue is empty and the writer cannot continue when the queue is full. For 2), signal is a special "optimized" substitute for broadcast, usable when only 1 other thread uses the CV (that is, broadcast should be the default choice). For 3), I'm uncertain, but that feels a bit like cargo cult programming; maybe you can explain how 3) avoids the problem caused by 1) and 2) by editing the answer?Appellant
@Appellant 3 avoids the problem because there is no window for the thread to block on the condition variable. If you change the predicate, unlock the mutex, and then signal the condition variable, you might wake a thread that went to sleep after you unlocked the mutex and that thread will go right back to sleep causing a missed wakeup.Armored
Let us continue this discussion in chat.Appellant

should we instead unlock it after the notify_all() ?

It is correct to do it either way, but you may get different behavior in different situations. It is quite difficult to predict how it will affect the performance of your program; I've seen both positive and negative effects for different applications. So it is better to profile your program and base the decision for your particular situation on the profiling results.
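
As a rough, hedged starting point for such profiling, a ping-pong micro-benchmark along these lines could be used; the structure, iteration count, and polling loop here are arbitrary choices of this sketch, not something prescribed by the answer. Swapping the two marked lines switches between unlock-then-notify and notify-then-unlock:

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool ready = false;
bool done = false;

void worker() {
    while (true) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return ready || done; });
        if (done) return;
        ready = false;  // consume the "token", then go back to waiting
    }
}

int main() {
    std::thread t(worker);
    constexpr int iterations = 100000;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        {
            std::unique_lock<std::mutex> lk(m);
            ready = true;
            lk.unlock();      // variant A: unlock first...
            cv.notify_one();  // ...then notify (swap these two lines for variant B)
        }
        // Crude pacing: poll until the worker has consumed the token.
        for (;;) {
            std::lock_guard<std::mutex> lk(m);
            if (!ready) break;
        }
    }
    auto elapsed = std::chrono::steady_clock::now() - start;
    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count()
              << " ms\n";

    {
        std::lock_guard<std::mutex> lk(m);
        done = true;
    }
    cv.notify_one();
    t.join();
}

A real measurement would need more care (warm-up, multiple runs, avoiding the busy polling), but it illustrates the kind of experiment the answer suggests.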

Marcel answered 25/9, 2018 at 17:56 Comment(2)
Did I understand correctly, notify_all() releases the mutex from the currently held lock? To me it's not very clear, because notify_all() doesn't accept our current lock as the argument (unlike wait). I know that our lock will be destroyed when the local scope ends, but not immediately after notify_all ?Lux
No, neither notify_all nor notify_one unlocks the mutex; that is your job, either manually or through an RAII mechanism. You would just get different behaviour (in terms of timing) depending on whether you unlock before or after notifying.Marcel

As mentioned here: cppreference.com

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.

That said, the documentation for wait says:

At the moment of blocking the thread, the function automatically calls lck.unlock(), allowing other locked threads to continue.

Once notified (explicitly, by some other thread), the function unblocks and calls lck.lock(), leaving lck in the same state as when the function was called. Then the function returns (notice that this last mutex locking may block again the thread before returning).

So when notified, wait will re-attempt to acquire the lock, and in that process it will get blocked again until the original notifying thread releases the lock. So I suggest releasing the lock before calling notify, as done in the example on cppreference.com, and most importantly:

Don't be Pessimistic.
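
For completeness, here is a minimal sketch of the unlock-before-notify pattern this answer recommends, in the spirit of the cppreference example; the names (m, cv, dataReady, publish) are placeholders, not taken from any particular codebase:

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool dataReady = false;  // placeholder predicate

void publish() {
    {
        std::lock_guard<std::mutex> lk(m);
        dataReady = true;
    }                 // the mutex is released here, before anyone is woken
    cv.notify_all();  // a woken waiter can acquire m without blocking on the notifier
}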

Drill answered 28/7, 2019 at 11:57 Comment(4)
The statement at cppreference.com is wrong. The notified thread would not immediately block again because it would never become ready-to-run. Any sensible implementation would know that the thread can't make forward progress because the mutex is held by the calling thread. (See my answer.)Armored
@DavidSchwartz Except that the "sensible implementation" does not exist? (See my answer.)Revell
@YongweiWu I wouldn't write code around the quirks of any particular implementation, even if it was the only one I was targeting, unless there was some significant demonstrated need to do so.Armored
@DavidSchwartz I wouldn't write code around the quirks of some special corner cases. In some cases there is no reason to even lock on the notifier side. Simpler code is better, and it does not pessimize.Revell

David's answer seems wrong to me.

First, assuming the simple case of two threads, one waiting for the other on a condition variable, unlocking first in the notifier will not wake the waiting thread, as the signal has not yet arrived. The subsequent notify call will then immediately wake the waiting thread. You do not need any special optimizations.

On the other hand, signalling first has the potential of waking up a thread and immediately putting it back to sleep, since it cannot acquire the lock, unless wait morphing is implemented.

Wait morphing does not exist on Linux, at least according to the answer to this StackOverflow question: Which OS / platforms implement wait morphing optimization?

The cppreference example also unlocks first before signalling: https://en.cppreference.com/w/cpp/thread/condition_variable/notify_all

It explicitly says:

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s). Doing so may be a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock, though some implementations recognize the pattern and do not attempt to wake up the thread that is notified under lock.

Revell answered 2/12, 2022 at 3:58 Comment(2)
"unlocking first by the notifier will not waken the other waiting thread, as the signal has not arrived" - do you know that no implementation can cause cv to wake up when mutexes are unlocked? The C++ standard gives almost no guarantees about spurious wakeups, and you seem to be making a positive claim here; do you have evidence for it?Pittance
@Yakk-AdamNevraumont I am not talking about spurious wakeup or lost wakeup here, which is a separate issue about correctness. I do encounter some cases where unlocking after signalling is necessary for correctness, but in other cases doing so is just performance pessimization. And I never said unlocking could wake up the waiting thread. The waiting thread should always check for a condition before continuing, so I am only concerned about the cases where a lost wakeup can cause issues.Revell

should we instead unlock it after the notify_all() ?

After reading several related posts, I've formed the opinion that it's purely a performance issue. If the OS supports "wait morphing", unlock after; otherwise, unlock before.

I'm adding an answer here to augment @DavidSchwartz's. In particular, I'd like to clarify his point 1.

If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.

  1. The 1st thing I said is that, because it's a CV and not a mutex, a better term for the so-called "deadlock" might be "sleep paralysis": a mistake some programs make is that

    • a thread that's supposed to wake up

    • instead stays asleep, because it did not recheck the condition it's been waiting for before waiting again.

  2. The 2nd thing is that, when waking some other thread(s),

    • the default choice should be broadcast/notify_all (broadcast is the POSIX term, which is equivalent to its C++ counterpart).

    • signal/notify is an optimized special case for when only 1 other thread is waiting.

  3. Finally, David is adamant that

    • it's better to unlock after notify,

    • because it can avoid the "deadlock" which I've been referring to as "sleep paralysis".

If it's unlock then notify, then there's a window where another thread (let's call it the "wrong" thread) may (i) acquire the mutex, (ii) go into wait, and (iii) wake up. If steps (i), (ii), and (iii) happen too quickly, the "wrong" thread consumes the signal, leaving the intended (let's call it the "correct") thread asleep.

I discussed this extensively with David; he clarified that the "sleep paralysis" can occur only when all 3 points are violated: 1. the condvar is associated with several separate conditions and/or the condition isn't rechecked before waiting again; 2. signal/notify wakes only 1 thread when more than 1 other thread uses the condvar; 3. unlocking before notifying creates a window for the race condition.

Finally, my recommendation is that points 1 and 2 are essential for the correctness of the program, and fixing issues associated with 1 and 2 should be prioritized over 3, which should only be an augmentative "last resort".
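
As a hedged sketch of points 1 and 2 only (the names m, cv, workAvailable, and waitForWork are placeholders, not from any particular codebase), the waiting side might look like this; the explicit loop is what the predicate overload of std::condition_variable::wait does internally:

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool workAvailable = false;  // the single condition associated with cv

void waitForWork() {
    std::unique_lock<std::mutex> lk(m);
    while (!workAvailable) {  // recheck after every wakeup, spurious or otherwise
        cv.wait(lk);          // atomically releases m and sleeps; relocks before returning
    }
    // Equivalent shorthand: cv.wait(lk, [] { return workAvailable; });
    // On the notifying side, prefer notify_all unless exactly one waiter can make progress.
}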

For reference, the man pages for signal/broadcast and wait contain some info from version 3 of the Single Unix Specification that explains points 1 and 2, and partly 3. Although specified for POSIX/Unix/Linux in C, its concepts are applicable to C++.

As of this writing (2023-01-31), the 2018 edition of version 4 of the Single Unix Specification has been released, and the drafting of version 5 is underway.

Appellant answered 31/1, 2023 at 2:21 Comment(0)

I am new to C++ threading and had to do some research online on using condition variables. I found the discussion between @DavidSchwartz and @DannyNiu interesting. Since I was not allowed to add a comment to that discussion, I am just offering another relevant reference here, from this CppCon video @36:18.
Back to Basics: Concurrency - Arthur O'Dwyer - CppCon 2020

void returnToken(Token t) {
   std::unique_lock lk(mtx_);
   tokens_.push_back(t);
   lk.unlock();
   cv_.notify_one();
}

Here Arthur O'Dwyer says:

Why isn't it a lock guard? It actually could be a lock guard. The reason I made it a unique lock is so that I could manually unlock it here before doing the notify. In previous presentations, I would sometimes reverse those lines and do the notify before the unlock. Semantically, it doesn't matter; but for performance, people keep telling me that it is a good idea to do the unlock before the notify.

Another article also shows unlocking before the notify.
C++ Core Guidelines: Be Aware of the Traps of Condition Variables
The main point of that article is: Don’t wait without a condition, which is off topic here.

Boonie answered 4/10, 2023 at 4:30 Comment(0)
