Why is there no wait function for condition_variable which does not relock the mutex
Consider the following example.

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;  // for the 1s literal

std::mutex mtx;
std::condition_variable cv;

void f()
{
  {
    std::unique_lock<std::mutex>  lock( mtx );
    cv.wait( lock );  // 1
  }
  std::cout << "f()\n";
}

void g()
{
  std::this_thread::sleep_for( 1s );
  cv.notify_one();
}

int main()
{
  std::thread  t1{ f };
  std::thread  t2{ g };
  t2.join();
  t1.join();
}

g() "knows" that f() is waiting in the scenario I would like to discuss. According to cppreference.com there is no need for g() to lock the mutex before calling notify_one. Now in the line marked "1", cv will release the mutex and relock it once the notification arrives. The destructor of lock releases it again immediately after that. This seems superfluous, especially since locking is expensive. (I know in certain scenarios the mutex needs to be locked. But this is not the case here.)

Why does condition_variable have no function "wait_nolock" which does not relock the mutex once the notification arrives? If the answer is that pthreads do not provide such functionality: why can't pthreads be extended to provide it? Is there an alternative for realizing the desired behavior?

Wendel answered 6/10, 2015 at 19:25 Comment(4)
Why do you think this has anything to do with pthreads?Sorosis
You can't reliably use condition_variable this way, due to possibility of spurious wake-ups. A condition_variable normally protects some state, and is used to wait until some predicate over that state becomes true. When it wakes up, you double-check the predicate, and go back to sleep if it's false. But of course access to the shared state must be under a mutex.Inimitable
As usual, you confuse condition variables with signals. POSIX does not have signals, and while condition variables could be used to imitate them, they are not signals per se.Wisp
@CaptainObvlious, de-facto the question has everything to do with pthread. CPP threads implementation is one-to-one mapping to Posix, which is, in my view, an unwelcome event.Wisp
You misunderstand what your code does.

Your code on line // 1 is free to not block at all. condition_variables can (and will!) have spurious wakeups -- they can wake up for no good reason at all.

You are responsible for checking if the wakeup is spurious.

Using a condition_variable properly requires 3 things:

  • A condition_variable
  • A mutex
  • Some data guarded by the mutex

The data guarded by the mutex is modified (under the mutex). Then (with the mutex possibly disengaged), the condition_variable is notified.

On the other end, you lock the mutex, then wait on the condition variable. When you wake up, your mutex is relocked, and you test if the wakeup is spurious by looking at the data guarded by the mutex. If it is a valid wakeup, you process and proceed.

If it wasn't a valid wakeup, you go back to waiting.

In your case, you don't have any data guarded, you cannot distinguish spurious wakeups from real ones, and your design is incomplete.

Not surprisingly with the incomplete design you don't see the reason why the mutex is relocked: it is relocked so you can safely check the data to see if the wakeup was spurious or not.

If you want to know why condition variables are designed that way, probably because this design is more efficient than the "reliable" one (for whatever reason), and rather than exposing higher level primitives, C++ exposed the lower level more efficient primitives.

Building a higher level abstraction on top of this isn't hard, but there are design decisions. Here is one built on top of std::experimental::optional:

template<class T>
struct data_passer {
  std::experimental::optional<T> data;
  bool abort_flag = false;
  std::mutex guard;
  std::condition_variable signal;

  void send( T t ) {
    {
      std::unique_lock<std::mutex> _(guard);
      data = std::move(t);
    }
    signal.notify_one();
  }
  void abort() {
    {
      std::unique_lock<std::mutex> _(guard);
      abort_flag = true;
    }
    signal.notify_all();
  }        
  std::experimental::optional<T> get() {
    std::unique_lock<std::mutex> _(guard);
    signal.wait( _, [this]()->bool{
      return data || abort_flag;
    });
    if (abort_flag) return {};
    T retval = std::move(*data);
    data = {};
    return retval;
  }
};

Now, each send can cause a get to succeed at the other end. If more than one send occurs, only the latest one is consumed by a get. If and when abort_flag is set, get() instead immediately returns {}.

The above supports multiple consumers and producers.

An example of how the above might be used is a source of preview state (say, a UI thread), and one or more preview renderers (which are not fast enough to be run in the UI thread).

The preview state dumps a preview state into the data_passer<preview_state> willy-nilly. The renderers compete and one of them grabs it. Then they render it, and pass it back (through whatever mechanism).

If the preview states come faster than the renderers consume them, only the most recent one is of interest, so the earlier ones are discarded. But existing previews aren't aborted just because a new state shows up.


Questions were asked in the comments below about race conditions.

If the data being communicated is atomic, can't we do without the mutex on the "send" side?

So something like this:

template<class T>
struct data_passer {
  std::atomic<std::experimental::optional<T>> data;  // illustrative: std::atomic requires a trivially copyable T, which optional generally is not
  std::atomic<bool> abort_flag = false;
  std::mutex guard;
  std::condition_variable signal;

  void send( T t ) {
    data = std::move(t); // 1a
    signal.notify_one(); // 1b
  }
  void abort() {
    abort_flag = true;   // 1a
    signal.notify_all(); // 1b
  }        
  std::experimental::optional<T> get() {
    std::unique_lock<std::mutex> _(guard); // 2a
    signal.wait( _, [this]()->bool{ // 2b
      return data.load() || abort_flag.load(); // 2c
    });
    if (abort_flag.load()) return {};
    T retval = std::move(*data.load());
    // data = std::experimental::nullopt;  // doesn't make sense
    return retval;
  }
};

The above fails to work.

We start with the listening thread. It does step 2a, then waits (2b). It evaluates the condition at step 2c, but doesn't return from the lambda yet.

The broadcasting thread then does step 1a (setting the data), then signals the condition variable. At this moment, nobody is waiting on the condition variable (the code in the lambda doesn't count!).

The listening thread then finishes the lambda, and returns "spurious wakeup". It then blocks on the condition variable, and never notices that data was sent.

The std::mutex used while waiting on the condition variable must guard the write to the data "passed" by the condition variable (whatever test you do to determine if the wakeup was spurious), and the read (in the lambda), or the possibility of "lost signals" exists. (At least in a simple implementation: more complex implementations can create lock-free paths for "common cases" and only use the mutex in a double-check. This is beyond the scope of this question.)

Using atomic variables does not get around this problem, because the two operations of "determine if the message was spurious" and "rewait in the condition variable" must be atomic with regards to the "spuriousness" of the message.

Escalante answered 6/10, 2015 at 19:38 Comment(18)
I think your argument is correct unless the variable which detects spurious wakeups is atomic. This could be the case if right after the wait a loop is processed and the upper bound is the atomic variable. In the case of a spurious wakeup that loop will not be processed and everything is fine.Wendel
"Then (with the mutex possibly disengaged), the condition_variable is notified" wouldn't this lead to a race condition?Hinson
@Hinson In the non-formal sense, yes. In the sense defined by the C++ standard, no (a C++ standard race condition causes UB: invoking the signal without the mutex doesn't cause UB, but can cause signals to be lost, which is one of the things that a non-C++ standard "race condition" term can describe).Escalante
@Yakk good point, probably would not be bad to mention that in your answer.Hinson
@Hinson Added a footnote, and an extra cautionary inline note.Escalante
@Yakk Can you please recommend a good source to read up on threads semantics, mutices, condition_variables, etc in C++11. I tried to read Eff. Modern C++, and found it not easy to understandPt
@first I am unaware of one: however, C++11 thread support is not intended for "end user" consumption, they are a thin abstraction over low level primitives. They let you write libraries that are easy to use; they are not easy to use.Escalante
@Yakk What then should the "end user" use if one needs to do multithreading in C++? pthreads?Pt
The answer is a good one but I don't think the footnote marked (1) is correct. In the case of a data race between the condition being true and the notification going out, it would not matter since the waiter would have already seen the condition being true (otherwise it would be waiting with the mutex unlocked).Tompion
@richard in a comment above I describe the possibility. It may be in error?Escalante
Keeping the lock while notifying is a pessimisation. It's not necessary because all operations around the condition variable are atomic. The race you are concerned about cannot happen.Tompion
Also have to disagree with the statement in footnote 1. Condition variables don't deliver messages, it's nonsensical to suggest that "messages sent when nobody is waiting are never received." A condition variable provides only a mechanism for threads to block awaiting a predicate to be true and for other threads to notify blocked threads that they need reevaluate the predicate. A design dependent on the number of wakeups being equal to the number of notifications is fundamentally flawed. Notifying inside the lock only increases contention.Garrotte
@Garrotte I figured out what my error was. I was solving the "use an atomic to send data" case, where we modify the "sent" data without using the mutex at all, not the "release the mutex before we send" case. You are correct in that releasing the mutex before the send is safe.Escalante
@RichardHodges Error corrected. See above comment if you are wondering how exactly I got it wrong.Escalante
@user3876684 my previous comment about holding the lock when notifying was in error. Please reread my post, I corrected the problem. It also covers why atomic variables isn't enough now.Escalante
For the atomic case, neither the modification of the data nor the notification need be protected by the mutex. It's sufficient to "poke" the mutex between the modification and notification, i.e., data = 42; std::lock_guard<std::mutex>(guard); signal.notify_one(); when paired with a double-checked pattern on the receive side if (data != 42) { std::unique_lock<std::mutex> _(guard); signal.wait(guard, [this]{ return data == 42 || abort; }); }. Yes this seems silly, but can achieve a lock-free fast path with edge-triggered instead of level-triggered notifications. Overkill for this question.Garrotte
@Garrotte True. I could imagine that being useful in some cases.Escalante
Just had to be a pedant and point out that it is not true that notifying inside the lock only increases contention any more. In fact, quite the reverse, notifying inside the lock cannot possibly allow any thread to make forward progress. A good implementation knows this, and so on good, modern implementations, it is cheaper to notify while holding the lock as that means one less expensive synchronization operation.Concession