Synchronizing very fast threads

Asked 8/10, 2015 at 9:9 Answered 8/10, 2015 at 14:32

Solved c++multithreading synchronization

In the following example (an idealized "game") there are two threads. The main thread which updates data and RenderThread which "renders" it to the screen. What I need it those two to be synchronized. I cannot afford to run several update iteration without running a render for every single one of them.

I use a condition_variable to sync those two, so ideally the faster thread will spend some time waiting for the slower. However condition variables don't seem to do the job if one of the threads completes an iteration for a very small amount of time. It seems to quickly reacquire the lock of the mutex before wait in the other thread is able to acquire it. Even though notify_one is called

#include <iostream>
#include <thread>
#include <chrono>
#include <atomic>
#include <functional>
#include <mutex>
#include <condition_variable>

using namespace std;

bool isMultiThreaded = true;

struct RenderThread
{
    RenderThread()
    {
        end = false;
        drawing = false;
        readyToDraw = false;
    }

    void Run()
    {
        while (!end)
        {
            DoJob();
        }
    }

    void DoJob()
    {
        unique_lock<mutex> lk(renderReadyMutex);
        renderReady.wait(lk, [this](){ return readyToDraw; });
        drawing = true;

        // RENDER DATA
        this_thread::sleep_for(chrono::milliseconds(15)); // simulated render time
        cout << "frame " << count << ": " << frame << endl;
        ++count;

        drawing = false;
        readyToDraw = false;

        lk.unlock();
        renderReady.notify_one();
    }

    atomic<bool> end;

    mutex renderReadyMutex;
    condition_variable renderReady;
    //mutex frame_mutex;
    int frame = -10;
    int count = 0;

    bool readyToDraw;
    bool drawing;
};

struct UpdateThread
{
    UpdateThread(RenderThread& rt)
        : m_rt(rt)
    {}

    void Run()
    {
        this_thread::sleep_for(chrono::milliseconds(500));

        for (int i = 0; i < 20; ++i)
        {
            // DO GAME UPDATE

            // when this is uncommented everything is fine
            // this_thread::sleep_for(chrono::milliseconds(10)); // simulated update time

            // PREPARE RENDER THREAD
            unique_lock<mutex> lk(m_rt.renderReadyMutex);
            m_rt.renderReady.wait(lk, [this](){ return !m_rt.drawing; });

            m_rt.readyToDraw = true;

            // SUPPLY RENDER THREAD WITH DATA TO RENDER
            m_rt.frame = i;

            lk.unlock();
            m_rt.renderReady.notify_one();

            if (!isMultiThreaded)
                m_rt.DoJob();
        }        

        m_rt.end = true;
    }

    RenderThread& m_rt;
};

int main()
{
    auto start = chrono::high_resolution_clock::now();

    RenderThread rt;
    UpdateThread u(rt);

    thread* rendering = nullptr;
    if (isMultiThreaded)
        rendering = new thread(bind(&RenderThread::Run, &rt));

    u.Run();

    if (rendering)
        rendering->join();

    auto duration = chrono::high_resolution_clock::now() - start;
    cout << "Duration: " << double(chrono::duration_cast<chrono::microseconds>(duration).count())/1000 << endl;


    return 0;
}

Here is the source of this small example code, and as you can see even on ideone's run the output is frame 0: 19 (this means that the render thread has completed a single iteration, while the update thread has completed all 20 of its).

If we uncomment line 75 (ie simulate some time for the update loop) everything runs fine. Every update iteration has an associated render iteration.

Is there a way to really truly sync those threads, even if one of them completes an iteration in mere nanoseconds, but also without having a performance penalty if they both take some reasonable amount of milliseconds to complete?

Distributive answered 8/10, 2015 at 9:9 Comment(1)

This looks like a traditional producer-consumer problem. Please have a look at vichargrave.com/multithreaded-work-queue-in-c – Garver 8/10, 2015 at 9:36

If I understand correctly, you want the 2 threads to work alternately: updater wait until the renderer finish before to iterate again, and the renderer wait until the updater finish before to iterate again. Part of the computation could be parallel, but the number of iteration shall be similar between both.

You need 2 locks:

one for the updating
one for the rendering

Updater:

wait (renderingLk)
update
signal(updaterLk)

Renderer:

wait (updaterLk)
render
signal(renderingLk)

EDITED:

Even if it look simple, there are several problems to solve:

Allowing part of the calculations to be made in parallel: As in the above snippet, update and render will not be parallel but sequential, so there is no benefit to have multi-thread. To a real solution, some the calculation should be made before the wait, and only the copy of the new values need to be between the wait and the signal. Same for rendering: all the render need to be made after the signal, and only getting the value between the wait and the signal.

The implementation need to care also about the initial state: so no rendering is performed before the first update.

The termination of both thread: so no one will stay locked or loop infinitely after the other terminate.

Unprejudiced answered 8/10, 2015 at 9:40 Comment(4)

Indeed... This is really the best and simplest solution. How silly of me :) Sorry for that. I guess I was too distracted that the the update thread was so fast and I didn't think of that elementary solution... – Distributive 8/10, 2015 at 10:54

The idea is simple, but the implementation not so much. I do not know how familiar you are with these type of problems, I will edit the solution to make some notes to care about. – Unprejudiced 8/10, 2015 at 11:2

Since the code that syncs the thread is guarded by mutexes in my example it's really a matter of a single line. m_rt.drawing = true at line 85. That's it. – Distributive 8/10, 2015 at 11:4

Why use two threads if they should not work parallely? – Manvel 17/8, 2022 at 15:17

I think a mutex (alone) is not the right tool for the job. You might want to consider using a semaphore (or something similar) instead. What you describe sound a lot like a producer/consumer problem, i.e., one process is allowed to run once everytime another process has finnished a task. Therefore you might also have a look at producer/consumer patterns. For example this series might get you some ideas:

A multi-threaded Producer Consumer with C++11

There a std::mutex is combined with a std::condition_variable to mimic the behavior of a semaphore. An approach that appears quite reasonable. You would probably not count up and down but rather toggle true and false a variable with needs redraw semantics.

For reference:

Witmer answered 8/10, 2015 at 9:40 Comment(2)

You should never need to use yield() in correctly synchronized code – Incontrollable 8/10, 2015 at 11:13

@JonathanWakely That is probably true. I removed the note. – Witmer 8/10, 2015 at 11:48

This is because you use a separate drawing variable that is only set when the rendering thread reacquires the mutex after a wait, which may be too late. The problem disappears when the drawing variable is removed and the check for wait in the update thread is replaced with ! m_rt.readyToDraw (which is already set by the update thread and hence not susceptible to the logical race.

Modified code and results

That said, since the threads do not work in parallel, I don't really get the point of having two threads. Unless you should choose to implement double (or even triple) buffering later.

Phyllotaxis answered 8/10, 2015 at 14:32 Comment(0)

A technique often used in computer graphics is to use a double-buffer. Instead of having the renderer and the producer operate on the same data in memory, each one has its own buffer. This is implemented by using two independent buffers, and switch them when needed. The producer updates one buffer, and when it is done, it switches the buffer and fills the second buffer with the next data. Now, while the producer is processing the second buffer, the renderer works with the first one and displays it.

You could use this technique by letting the renderer lock the swap operation such that the producer may have to wait until rendering is finished.

Takeshi answered 8/10, 2015 at 11:11 Comment(0)

Recommended topics

Hot tags