libc++ implementation of std::condition_variable_any
Condition variables should have a single order with respect to notify() and unlock_sleep() operations (unlock_sleep() being an imaginary function used within wait(), where the mutex is unlocked and the thread goes to sleep as one atomic sequence of operations). To achieve this with arbitrary lockables, std::condition_variable_any implementations typically use another mutex internally (both to ensure atomicity and to sleep on).

If the internal unlock_sleep() and notify() (notify_one() or notify_all()) operations are not atomic with respect to each other, you risk a thread unlocking the mutex, another thread signaling, and then the original thread going to sleep and never waking up.
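
To make the race concrete, here is a deliberately broken sketch of a wait that separates the unlock from the sleep (broken_wait and wait_for_notification are hypothetical names, for illustration only):

// BROKEN: unlocking and sleeping are two separate, interleavable steps
template <class Lockable>
void broken_wait(Lockable& external) {
    external.unlock();
    // a notify() arriving here finds no sleeping thread and is lost;
    // the call below then sleeps with no notification left to wake it
    wait_for_notification();  // hypothetical "block until notified" primitive
    external.lock();
}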

I was reading the libstdc++ and libc++ implementations of std::condition_variable_any and noticed this code in the libc++ implementation:

{lock_guard<mutex> __lx(*__mut_);}
__cv_.notify_one();

The internal mutex is locked and then immediately unlocked before the signal operation. Doesn't this risk the problem I described above?

libstdc++ seems to have gotten this right: it holds the internal mutex across the underlying notify call.

Furie asked 29/10, 2017 at 18:14 Comment(2)
I'm unfamiliar with unlock_sleep(). Can you provide a link to docs? Also, by notify() do you mean either of notify_one() or notify_all()? – Argentite
@MichaelBurr sorry, I meant the general operation of going to sleep and unlocking the mutex as one atomic operation. I'll update the question. – Furie

The C++11 and later standards explicitly say "The execution of notify_one and notify_all shall be atomic". So in one sense I think you are correct that the internal mutex should be held across the call down to the platform's underlying condition variable notify call (for example, pthread_cond_signal()).

However, I don't think that the libc++ implementation will cause notifications to be missed. Unless the notifying thread, when calling notify_one() (or notify_all()), synchronizes on the lock that the waiting thread passes to wait() (or on some other synchronization between the two threads), there is no way to ensure which of the two threads gets to the notify or the wait 'first' anyway. So if a notification can be missed in libc++'s current implementation, it could also be missed if libc++ were changed to hold the internal lock across its call down to the platform's notify API.

So I think that libc++ can invoke the "as if" rule to say that its implementation of notify_one()/notify_all() is "atomic enough".
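
To make "some other synchronization between the two threads" concrete, here is the standard predicate-loop pattern (a sketch with names of my own choosing) under which the question of who is 'first' stops mattering, because the predicate is checked and updated under the external lock:

#include <condition_variable>
#include <mutex>

std::mutex ext;                      // the lockable the waiter passes to wait()
std::condition_variable_any cva;
bool ready = false;                  // the predicate, protected by ext

void waiter() {
    std::unique_lock<std::mutex> lk(ext);
    while (!ready)                   // re-check after every wakeup, spurious or not
        cva.wait(lk);
}

void notifier() {
    {
        std::lock_guard<std::mutex> lk(ext);
        ready = true;                // update the predicate under the same lock
    }
    cva.notify_one();                // notify after releasing ext
}

If the waiter saw ready == false, it still held ext at that moment, so the notifier's update and its subsequent notify_one() cannot have finished before wait() began releasing ext; if the waiter saw ready == true, it never sleeps. Either way no wakeup is lost, regardless of the internal ordering.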

Argentite answered 2/11, 2017 at 6:01 Comment(8)
> "if the notification can be missed in libc++'s current implementation, it could also be missed if libc++ were changed to hold the internal lock across its call down to the platform's notify API" I don't think so. If the internal lock were held, the whole sequence of going to sleep and unlocking the internal mutex would be atomic, and there would be no way for a notify to creep in right in the middle of unlocking the internal mutex and going to sleep (which is the problem I am asking about). – Furie
How are these two scenarios different: scenario 1) thread W's wait() acquires the internal mutex, then thread N's notify_one() occurs (because it has already released the internal mutex) before thread W actually calls down to pthread_cond_wait(); and scenario 2) thread N's notify_one() call executes entirely just before thread W's wait() call acquires the internal mutex? I believe that without some external synchronization the user of the API can't distinguish between those two situations. – Argentite
Why acquire the mutex in the first place in notify_one() then? – Furie
open-std.org/jtc1/sc22/wg21/docs/papers/2007/… this might explain it better than I can – Furie
I think that you're asking about the problem of missing a notification when the predicate has been updated (please correct me if I'm wrong). ThreadW needs to hold a lock while checking the predicate and continue holding that lock when calling wait() (if it decides it needs to wait). ThreadN needs to hold that same lock when updating the predicate (though it doesn't need to hold the lock when calling notify_xxx()). But the internal mutex in condition_variable_any is not the lock that protects the predicate. So the internal mutex isn't what protects from missing a notify. – Argentite
My understanding is that the internal mutex is just something that condition_variable_any can use with the underlying platform API (since that API needs an actual mutex, while the thing that can be passed to condition_variable_any::wait() doesn't need to be a mutex - it just needs to be something that's "lockable"). condition_variable_any::wait() uses the internal mutex as a surrogate for the lockable object that gets passed in when it calls down to the platform API. – Argentite
And I can't explain why the notify_xxx() call acquires/releases the internal mutex. Clearly I don't have a complete understanding of what's going on - which is not unusual when dealing with low-level synchronization. – Argentite
I think I understand why the notify_xxx() functions acquire/release the mutex, and it is critical to preventing the missed-notification scenario. However, it does still require that the user of condition_variable_any properly use an external lock to protect the check and update of the predicate. I don't have time to write up my understanding right now - it'll have to wait until after work (hopefully it won't have evaporated from my brain by then...). – Argentite

[...] you risk a thread unlocking the mutex, another thread signaling and then the original thread going to sleep and never waking up.

This scenario indeed occurs if the internal unlock/wait[1] sequence is not atomic, as demonstrated in N2406. The solution, as outlined in the paper and also mentioned in your question, is straightforward: ensure that the unlock/wait and notify operations (notify_one and notify_all) are atomic with respect to each other. This entails:

void notify_one() {
    unique_lock<mutex> internal(mut);  // held across the notify, making it atomic
    cv.notify_one();                   // with any concurrent unlock/wait sequence
}
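
For context, the unlock/wait side in this scheme funnels through the same internal mutex; a minimal sketch of the corresponding wait() (same names as above, exception safety omitted):

template <class Lockable>
void wait(Lockable& external) {
    unique_lock<mutex> internal(mut);  // taken before releasing the external lock
    external.unlock();
    cv.wait(internal);                 // atomically releases internal and sleeps
    internal.unlock();
    external.lock();                   // re-lock the caller's lockable before returning
}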

However, it appears that this solution is stronger than necessary. It suffices to ensure that unlock/wait is atomic with respect to the notify operation; the reverse is not essential, given that the underlying notify operations are already atomic. To illustrate this point, consider the following example[2]:

Thread A                                             | Thread B
                                                     | external.lock()
                                                     | Change predicate
                                                     | external.unlock()
                                                     | Waking Thread A up:
                                                     | Enter condition_variable_any::notify_one
                                                     | unique_lock<mutex> internal(mut)
external.lock()                                      |
Check predicate, decide to wait[3]                   |
Enter condition_variable_any::wait                   |
unique_lock<mutex> internal(mut) (blocks until (2))  |
                                                     | cv.notify_one() (1)
                                                     | internal.~unique_lock<mutex>() (2)
external.unlock()                                    |
cv.wait(internal)                                    |
                                                     | Exit condition_variable_any::notify_one
...                                                  | ...

In this example, the order of evaluation between (1) and (2) is inconsequential: even if A is awakened by B, it will interpret the wakeup as spurious and continue waiting, leading to essentially the same outcome. Specifically:

Thread A                                             | Thread B
...                                                  | ...
                                                     | unique_lock<mutex> internal(mut)
external.lock() (3)                                  |
Check predicate, decide to wait                      |
Enter condition_variable_any::wait                   |
unique_lock<mutex> internal(mut) (blocks until (2))  |
                                                     | internal.~unique_lock<mutex>() (2)
external.unlock()                                    |
cv.wait(internal)                                    |
                                                     | cv.notify_one() (1)
Awakened by Thread B                                 |
internal.~unique_lock<mutex>()                       |
Jump to (3) above                                    |
...                                                  | ...

As observed, releasing the internal mutex early in the notify functions risks at most a spurious wakeup. In practice this has minimal impact, since spurious wakeups are possible regardless. From a language-lawyer perspective, I believe libc++'s implementation is standard-conforming under the as-if rule. It's worth noting that in the above example we notify after releasing the external lock; if we notify while holding that lock, there is minimal difference between libc++'s implementation and libstdc++'s.
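
A final note, prompted by the comments on the other answer: even with libc++'s immediate release, acquiring the internal mutex in notify_one() is not pointless. In the wait() sketch above, the waiter holds mut from before external.unlock() until cv.wait(internal) atomically releases it. If notify_one() skipped mut entirely, the following interleaving would lose a wakeup (assuming the usual discipline of updating the predicate under the external lock):

Thread A (waiter)                  | Thread B (notifier)
unique_lock<mutex> internal(mut)   |
external.unlock()                  |
                                   | cv.notify_one() (no waiter yet: lost)
cv.wait(internal)                  |
(sleeps forever)                   |

Acquiring mut, even if it is released immediately, forces the notifier to block until the waiter is inside cv.wait(internal), so the notification cannot land in that window.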


[1] Assuming that we use an internal condition_variable to construct condition_variable_any.
[2] Adapted and modified from N2406.
[3] The predicate may have been altered by another thread due to a race condition.

Astarte answered 21/10, 2023 at 4:20 Comment(0)
