std::atomic::notify_one could unblock multiple threads
Asked Answered
E

2

19

According to cppreference, std::atomic<T>::notify_one() will notify at least one thread that is waiting on said atomic. This means that according to the standard it could unblock more than one thread. This is in contrast to std::condition_variable::notify_one(), which specifies that it will unblock (no more than) one thread.

Where does this difference come from? Does this not use the same underlying mechanism? As far as implementations of the standard library go, do all of the prevalent ones have a chance of actually unblocking multiple with this call, or are there some that always unblock exactly one?

Eskimo answered 9/6, 2022 at 20:48 Comment(2)
One valid implementation strategy for std::atomic::wait() is a naive spin-wait loop that just keeps reading the variable. If that's what any of your waiters are doing (either temporarily before calling futex, or just on a simplistic implementation), they'll wake up on their own. And notify_one() would be a no-op in a truly simplistic implementation without a fallback to OS-assisted sleep/wake. The standard clearly wanted to allow such an implementation, but it's still an interesting question whether any mainstream implementation is like that. Or can wake multiple in other ways.Fenella
Where does it say "no more than"? Because spurious wakeups are allowed, surely notify_one is allowed to unblock more than one thread. (the other unblocks being "spurious" wakeups that "just happened" to occur at the same time you called notify_one)Tupiguarani
G
13

Both std::atomic::wait and std::condition_variable::wait are allowed to unblock spuriously at any time, including at the specific time that notify_one is called.

So in terms of the specification, although it does indeed seem that the standard uses different wording for the two ("at least one" in [atomics.types.operation]/32 but just "one" in [thread.condition.condvar]/5), I don't think there is any normative difference here. The implementation is free to unblock any number of threads for both waiting operations when notify_one is called, as long as it is at least one (if any are waiting). This is true for std::atomic::notify_one as well as std::condition_variable::notify_one (i.e. "(no more than)" is wrong).


In an earlier revision of the proposal [P1135R4] for the std::atomic waiting operations it still said "one" instead of "at least one" in the proposed wording.

That was changed with revision [P1135R5], according to which wording improvements were made partly in response to feedback from LWG teleconferences. I don't think any documentation of these teleconferences is publicly available, so I can't tell whether there was a specific intent in this wording change.

Maybe it is supposed to avoid misunderstandings since the spurious unblocking is mentioned in a different paragraph or maybe it is meant to convey that there are implementations which do unblock multiple/all waiting operations that are specifically intended to be supported as such. See also comments under the question and this answer.

Goldin answered 9/6, 2022 at 22:42 Comment(4)
It still seems like a difference in intent of the wording, though. e.g. allowing simplistic spin-wait implementations of wait().Fenella
@PeterCordes Could be the case. I found a change from "one" to "at least one" in one of the proposal revisions, but unfortunately no explanation for it. See edited answer.Goldin
With condition variables, only one waiting thread can resume because they must all have the same lock. But it is true that the other threads could spuriously resume after the first to wake releases the lock.Sanfred
@Sanfred Blocking on the lock happens after initially unblocking from the waiting state, not atomically with it. See timsong-cpp.github.io/cppwp/n4868/thread.condition#general-3 and timsong-cpp.github.io/cppwp/n4868/thread.condition#condvar-8.2Goldin
E
9

As far as I can tell, they can use the same mechanism, but don't have to. The implementation I typically use is libstdc++ and its atomic::notify_one() is definitely different.

The condition_variable is fairly opaque, implemented as __condvar in the bits/std_mutex.h header. Its notify_one() just calls a pthread wrapper:

void
notify_one() noexcept
{
  int __e __attribute__((__unused__)) = __gthread_cond_signal(&_M_cond);
  __glibcxx_assert(__e == 0);
}

whereas notify_all() instead calls __gthread_cond_broadcast.

This contrasts with the atomics, which are much more involved, but end up using a sequence of functions for notifying threads, starting with __atomic_notify_address:

template<typename _Tp>
void
__atomic_notify_address(const _Tp* __addr, bool __all) noexcept
{
  __detail::__bare_wait __w(__addr);
  __w._M_notify(__all);
}

__waiter_base::_M_notify:

void
_M_notify(bool __all, bool __bare = false)
{
  if (_M_laundered())
  {
    __atomic_fetch_add(_M_addr, 1, __ATOMIC_SEQ_CST);
    __all = true;
  }
  _M_w._M_notify(_M_addr, __all, __bare);
}

__waiter_pool_base::_M_notify:

void
_M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept
{
  if (!(__bare || _M_waiting()))
    return;

#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
  __platform_notify(__addr, __all);
#else
  if (__all)
    _M_cv.notify_all();
  else
    _M_cv.notify_one();
#endif
}

So if there is no _GLIBCXX_HAVE_PLATFORM_WAIT, libstdc++ will simply use a condition variable to notify either one or all threads. Otherwise, it's implemented using a futex, and may also wake (up to?) INT_MAX threads, see __platform_notify:

template<typename _Tp>
void
__platform_notify(const _Tp* __addr, bool __all) noexcept
{
  syscall (SYS_futex, static_cast<const void*>(__addr),
           static_cast<int>(__futex_wait_flags::__wake_private),
           __all ? INT_MAX : 1);
}
Eyed answered 9/6, 2022 at 22:19 Comment(3)
The futex(2) man page for FUTEX_WAKE says "Most commonly, val is specified as either 1 (wake up a single waiter) or INT_MAX (wake up all waiters)". So it seems INT_MAX is a special case, because in general This operation wakes at most val of the waiters that are waiting. But INT_MAX says "all", not "up to all", and for things like a barrier() where you want all threads to wait and then all continue, you don't want some still stuck.Fenella
@PeterCordes The wording confuses me, because at the same time it says "No guarantee is provided about which waiters are awoken (e.g., a waiter with a higher scheduling is not guaranteed to be awoken in preference to a with a lower priority)." I guess we can't know for sure without looking at the return value.Eyed
My reading of that sentence about "which waiters" was that it's only relevant to cases where you ask for less-than-all waiters to be woken. If you're waking all of them, they all get woken. Although presumably still without a guarantee of nice level affecting which ones get woken first if there are more tasks than cores to run them. (Nor realtime priority levels like SCHED_FIFO.)Fenella

© 2022 - 2024 — McMap. All rights reserved.