Why does 'wait with predicate' solve the 'lost wakeup' problem for a condition variable?

I am trying to understand the difference between spurious and lost wakeups for a condition variable. Below is a small piece of code I tried. I understand that the consumer in this case could wake up without any notification, and therefore the wait needs to check a predicate.

But how does wait with a predicate solve the issue of 'lost wakeup'? As you can see in the code below, 'wait' is not called for 5 seconds, and I was expecting it to miss the first few notifications; but with the predicate, it does not miss any. Are these notifications saved for a future wait?

#include <chrono>
#include <condition_variable>
#include <deque>
#include <iostream>
#include <mutex>
#include <thread>

std::deque<int> q;
std::mutex m;
std::condition_variable cv;

void dump_q()
{
    for (auto x: q) {
        std::cout << x << std::endl;
    }
}

void producer()
{
    for(int i = 0; i < 10; i++) {
        std::unique_lock<std::mutex> locker(m);
        q.push_back(i);
        std::cout << "produced: " << i << std::endl;
        cv.notify_one();

        std::this_thread::sleep_for(std::chrono::seconds(1));
        locker.unlock();
    }
}

void consumer()
{
    while (true) {
        int data = 0;
        std::this_thread::sleep_for(std::chrono::seconds(5));   // <- should miss first 5 notifications?
        std::unique_lock<std::mutex> locker(m); 
        cv.wait(locker);
        //cv.wait(locker, [](){return !q.empty();});  // <- this fixes both spurious and lost wakeups
        data = q.front();
        q.pop_front();
        std::cout << "--> consumed: " << data << std::endl;
        locker.unlock();
    }
}

int main(int argc, char *argv[])
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join();
    t2.join();
    return 0;
}
Shoot answered 30/5, 2019 at 19:55 Comment(0)

It is the atomic "unlock and wait" operation that prevents lost wakeups. A lost wakeup happens this way:

  1. We acquire the lock that protects the data.
  2. We check to see whether we need to wait and we see that we do.
  3. We need to release the lock because otherwise no other thread can access the data.
  4. We wait for a wakeup.

You can see the risk of a lost wakeup here. Between steps 3 and 4, another thread could acquire the lock and send a wakeup. We have released the lock, so another thread can do this, but we aren't waiting yet, so we wouldn't get the signal.
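
To make "we aren't waiting yet, so we wouldn't get the signal" concrete, here is a minimal sketch. The helper name `wait_after_early_notify` is hypothetical and not part of the question's code. It sends a notification while no thread is waiting and then does a bare timed wait with no predicate. Because a condition variable does not remember notifications, the wait is expected to run into its timeout (a spurious wakeup could, in principle, end it earlier).

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper for illustration: notify first, wait afterwards.
std::cv_status wait_after_early_notify()
{
    std::mutex m;
    std::condition_variable cv;

    // No thread is blocked in wait() yet, so this call wakes nobody
    // and leaves no trace behind.
    cv.notify_one();

    std::unique_lock<std::mutex> lock(m);
    // Bare wait without a predicate: the earlier notification is gone,
    // so normally only the timeout ends this call.
    std::cv_status status = cv.wait_for(lock, std::chrono::milliseconds(200));

    // wait_for() always reacquires the lock before returning.
    assert(lock.owns_lock());
    return status;
}
```

This is why a bare `cv.wait(locker)` after a long sleep can block on a notification that was already sent, while a wait that first checks the shared state does not depend on the notification at all.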

So long as step 2 is done under the protection of the lock and steps 3 and 4 are atomic, there is no risk of a lost wakeup. A wakeup cannot be sent until the data is modified, which cannot happen until another thread acquires the lock. Since steps 3 and 4 are atomic, any thread that sees the lock as unlocked will necessarily also see us waiting.
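
The four steps map directly onto what the predicate overload of `wait` does. The sketch below spells that out with a hypothetical helper, `wait_until_true`, which is assumed here only for illustration; `cv.wait(lock, pred)` behaves like this loop. The second function, also hypothetical, shows a waiter that arrives after the data and the notification and still does not block.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>

// Hypothetical helper: a hand-written equivalent of cv.wait(lock, pred).
template <typename Predicate>
void wait_until_true(std::condition_variable& cv,
                     std::unique_lock<std::mutex>& lock,
                     Predicate pred)
{
    // Steps 1-2: the caller holds the lock, so pred() is checked under it.
    while (!pred()) {
        // Steps 3-4: wait() releases the lock and blocks as one atomic
        // operation, then reacquires the lock before returning.
        cv.wait(lock);
    }
}

// Hypothetical usage: the item is queued and the notification is sent
// before anyone waits. Returns true if the late waiter got the item.
bool late_waiter_does_not_block()
{
    std::mutex m;
    std::condition_variable cv;
    std::deque<int> q;

    {
        std::lock_guard<std::mutex> guard(m);
        q.push_back(42);
    }
    cv.notify_one();  // nobody is waiting; this notification is dropped

    std::unique_lock<std::mutex> lock(m);
    // The predicate is already true, so cv.wait() is never called.
    wait_until_true(cv, lock, [&q] { return !q.empty(); });
    assert(lock.owns_lock());
    return q.front() == 42;
}
```

The notification itself is not stored anywhere. The waiter proceeds because the state protected by the mutex already says there is nothing to wait for.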

This atomic "unlock and wait" is the primary purpose of condition variables and the reason they must always be associated with a mutex and a predicate.

In code above, consumer is not waiting for first few notifications because it is sleeping. Is it not missing notify in this case? Is this case not similar to race condition between #3 and #4?

Nope. Can't happen.

Either the consumer that is not waiting holds the lock or it doesn't. If the consumer that is not waiting holds the lock, it can't miss anything. The predicate cannot change when it holds the lock.

If the consumer is not holding the lock, then it doesn't matter what it misses. When it checks to see whether it should wait in step 2, if it missed anything, it will necessarily see it in step 2 and it will see that it does not need to wait, so it will not wait for the wakeup that it missed.

So if the predicate is such that the thread does not need to wait, the thread will not wait because it checks the predicate. There is no opportunity for a missed wakeup prior to step 1.

The only time an actual wakeup is needed is if a thread goes to sleep. The atomic unlock and sleep ensures that a thread can only decide to go to sleep while it holds the lock and while the thing it needs to wait for has not yet happened.
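
As a rough illustration of this argument, the sketch below takes the question's scenario to its extreme: the producer finishes, and sends every notification, before the consumer even starts. The function name `consume_after_producer_finished` and its structure are assumptions made for this example; it is not the original program. With the predicate form of `wait`, the consumer still receives every item, because it only blocks when the queue is actually empty.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: all notifications are sent while nobody is waiting.
std::vector<int> consume_after_producer_finished(int count)
{
    std::deque<int> q;
    std::mutex m;
    std::condition_variable cv;

    std::thread producer([&] {
        for (int i = 0; i < count; ++i) {
            {
                std::lock_guard<std::mutex> guard(m);
                q.push_back(i);
            }                 // release the lock before notifying
            cv.notify_one();  // no waiter exists, so nothing is woken
        }
    });
    producer.join();          // every notification has been "missed" by now

    std::vector<int> consumed;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) {
            std::unique_lock<std::mutex> lock(m);
            // The predicate is checked first, under the lock; the thread
            // sleeps only if the queue is empty at that moment.
            cv.wait(lock, [&] { return !q.empty(); });
            assert(!q.empty());
            consumed.push_back(q.front());
            q.pop_front();
        }
    });
    consumer.join();
    return consumed;
}
```

Calling `consume_after_producer_finished(10)` returns the ten items in the order they were produced, even though the consumer never saw a single wakeup.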

Arc answered 30/5, 2019 at 20:0 Comment(4)
In code above, consumer is not waiting for first few notifications because it is sleeping. Is it not missing notify in this case? Is this case not similar to race condition between #3 and #4? Thank you for helping. – Shoot
@spa I'll update my answer. Step 2 cannot happen in that case, so we never get to step 4. – Arc
@DavidSchwartz, Sorry to ask this question late. I am still not sure of one thing. Are you saying the consumer will surely miss notifications (because of the sleep, std::this_thread::sleep_for(std::chrono::seconds(5));)? But as the consumer is checking for a predicate, it doesn't matter whether it misses them? – Mclaren
@Mclaren Correct. It doesn't matter if it misses a wakeup if it wasn't waiting for a wakeup at the time anyway. – Arc
