Primitive synchronization primitives -- safe?

On constrained devices, I often find myself "faking" locks between 2 threads with 2 bools. Each is only read by one thread, and only written by the other. Here's what I mean:

bool quitted = false, is_paused = false;        // written by downloader_thread, read by ui_thread
bool should_quit = false, should_pause = false; // written by ui_thread, read by downloader_thread

void downloader_thread() {
    quitted = false;
    while(!should_quit) {
        fill_buffer(bfr);
        if(should_pause) {
            is_paused = true;
            while(should_pause) sleep(50);
            is_paused = false;
        }
    }
    quitted = true;
}

void ui_thread() {
    // new Thread(downloader_thread).start();
    // ...
    should_pause = true;
    while(!is_paused) sleep(50);
        // resize buffer or something else non-thread-safe
    should_pause = false;
}

Of course on a PC I wouldn't do this, but on constrained devices, it seems reading a bool value would be much quicker than obtaining a lock. Of course I trade off for slower recovery (see "sleep(50)") when a change to the buffer is needed.

The question -- is it completely thread-safe? Or are there hidden gotchas I need to be aware of when faking locks like this? Or should I not do this at all?

Weinstock answered 23/4, 2009 at 18:15 Comment(2)
sleep(50) is very slow recovery - the parameter is in seconds. Compared with a semaphore, I think you probably introduce more average delay per op this way... – Syllabary
Agreed, you should definitely be doing a spin lock here. – Invade

Unless you understand the memory architecture of your device in detail, as well as the code generated by your compiler, this code is not safe.

Just because it seems that it would work, doesn't mean that it will. "Constrained" devices, like the unconstrained type, are getting more and more powerful. I wouldn't bet against finding a dual-core CPU in a cell phone, for instance. That means I wouldn't bet that the above code would work.

Papillon answered 23/4, 2009 at 18:20 Comment(1)
"I wouldn't bet against finding a dual-core CPU in a cell phone, for instance" - that is funny to read 6 years later with QuadCores in phones already :)Coley

Using bool values to communicate between threads can work as you intend, but there are indeed two hidden gotchas as explained in this blog post by Vitaliy Liptchinsky:

Cache Coherency

A CPU does not always fetch memory values from RAM. Fast memory caches on the die are one of the tricks CPU designers use to work around the von Neumann bottleneck. On some multi-CPU or multi-core architectures (like Intel's Itanium) these CPU caches are not shared or automatically kept in sync. In other words, your threads may be seeing different values for the same memory address if they run on different CPUs.

To avoid this you need to declare your variables as volatile (C++, C#, Java), or do explicit volatile reads/writes, or make use of locking mechanisms.

Compiler Optimizations

The compiler or JITter may perform optimizations which are not safe if multiple threads are involved. See the linked blog post for an example. Again, you must make use of the volatile keyword or other mechanisms to inform your compiler.
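
If your toolchain offers C++11 atomics (or an equivalent library), the smallest change that addresses both gotchas is to make the four flags std::atomic<bool>; the defaulted (sequentially consistent) atomic loads and stores give you both the visibility and the ordering that plain bool does not. A minimal sketch, reusing the question's names (fill_buffer and the buffer itself are assumed to exist elsewhere):

#include <atomic>
#include <chrono>
#include <thread>

// The four flags from the question, made atomic. The default operations
// are atomic and sequentially consistent, so neither the caches nor the
// compiler can hide a write from the other thread.
std::atomic<bool> quitted{false}, is_paused{false};
std::atomic<bool> should_quit{false}, should_pause{false};

void downloader_thread() {
    quitted = false;
    while (!should_quit) {
        // fill_buffer(bfr);   // as in the question
        if (should_pause) {
            is_paused = true;
            while (should_pause)
                std::this_thread::sleep_for(std::chrono::milliseconds(50));
            is_paused = false;
        }
    }
    quitted = true;
}

void ui_pause_resume() {
    should_pause = true;
    while (!is_paused)
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
    // resize buffer or other non-thread-safe work here
    should_pause = false;
}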

Jinajingle answered 23/4, 2009 at 23:3 Comment(3)
Volatile does not imply memory barriers. It will not help. You really need the proper synchronization primitives, which are not available as part of the C language yet but should be available (as _Atomic types) in C1x. Until then, you must use asm with the right locking instructions, or the primitives your threads implementation provides (pthreads or whatever). – Beaty
@R..: are you saying that when using volatile, threads may still see different values for the same memory address? Or are you talking about some other problem? – Jinajingle
Yes, I am. On x86 this is unlikely. The problem is limited to a few non-mainstream clone brands, which, as far as I know, do not include any chips by Intel, AMD, Cyrix, VIA, etc. But on less civilized RISC-y architectures, memory synchronization between CPUs can be hell. All the volatile keyword does is ensure that an actual load is generated for each abstract-machine read and an actual store is generated for each abstract-machine write. It does not impose additional cache synchronization between CPUs. – Beaty

Concerning the sleep call, you could always just do sleep(0), or the equivalent call that yields your thread and lets the next one in line take a turn.

Concerning the rest, this is thread safe if you know the implementation details of your device.
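
For example (a sketch, assuming a C++11 toolchain; sleep(0), Sleep(0) or sched_yield() play the same role on other platforms), the pause loop could yield instead of sleeping a fixed interval:

#include <atomic>
#include <thread>

std::atomic<bool> should_pause{false};   // stands in for the question's flag

// Give up the rest of the timeslice instead of sleeping a fixed 50 units.
void wait_while_paused() {
    while (should_pause)
        std::this_thread::yield();
}

The trade-off is that a yielding loop is rescheduled as soon as the scheduler comes back around to it, so it burns more CPU (and battery) than a real sleep if the pause lasts a while.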

Charged answered 23/4, 2009 at 18:22 Comment(0)

Answering the questions.

Is this completely thread-safe? I would answer no, this is not thread-safe, and I would just not do this at all. Without knowing the details of your device and compiler: if this is C++, the compiler is free to reorder and optimize things away as it sees fit. For example, you wrote:

is_paused = true;            
while(should_pause) sleep(50);            
is_paused = false;

but the compiler may choose to reorder this into something like this:

sleep(50);
is_paused = false;

This probably won't work even on a single-core device, as others have said.

Rather than taking a lock, you may do better to simply do less on the UI thread rather than yielding in the middle of processing UI messages. If you think you have spent too much time on the UI thread, find a way to cleanly exit and register an asynchronous callback.

If you call sleep on a UI thread (or try to acquire a lock, or do anything else that may block) you open the door to hangs and glitchy UIs. A 50 ms sleep is enough for a user to notice. And if you acquire a lock or do any other blocking operation (like I/O) you have to deal with waiting an indeterminate amount of time for it to complete, which tends to turn a glitch into a hang.

Purloin answered 23/4, 2009 at 22:28 Comment(0)

This code is unsafe under almost all circumstances. On multi-core processors you will not have cache coherency between cores because bool reads and writes are not atomic operations. This means each core is not guaranteed to have the same value in its cache, or even in memory, if the cache from the last write hasn't been flushed.

However, even on resource-constrained single-core devices this is not safe, because you do not have control over the scheduler. Here is an example; for simplicity I'm going to pretend these are the only two threads on the device.

When the ui_thread runs, the following lines of code could be run in the same timeslice.

// new Thread(downloader_thread).start();
// ...
should_pause = true;

The downloader_thread runs next, and in its timeslice the following lines are executed:

quitted = false;
while(!should_quit)
{
    fill_buffer(bfr);

The scheduler preempts the downloader_thread before fill_buffer returns and then activates the ui_thread, which runs:

while(!is_paused) sleep(50);
// resize buffer or something else non-thread-safe
should_pause = false;

The resize-buffer operation is done while the downloader_thread is in the middle of filling the buffer. This means the buffer is corrupted and you'll likely crash soon. It won't happen every time, but the fact that you fill the buffer before you set is_paused to true makes it more likely. Even if you switched the order of those two operations on the downloader_thread, you would still have a race condition; you'd just be more likely to deadlock instead of corrupting the buffer.

Incidentally, this is a type of spinlock; it just doesn't work. Spinlocks aren't a good fit for waits that are likely to span many timeslices, because they keep the processor spinning. Your implementation does sleep, which is a bit nicer, but the scheduler still has to keep running your thread, and thread context switches aren't cheap. If you wait on a critical section or semaphore instead, the scheduler doesn't activate your thread again until the resource has become free.
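
For comparison, here is a sketch of the same pause/resume handshake built on a mutex and condition variable (C++11 here; the names are mine, not the question's), where the waiting thread really is descheduled until it is signalled:

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool pause_requested = false;   // set by the UI thread
bool paused = false;            // set by the downloader thread

// Downloader side: call this between fill_buffer() iterations.
void maybe_pause() {
    std::unique_lock<std::mutex> lk(m);
    if (pause_requested) {
        paused = true;
        cv.notify_all();                              // tell the UI thread we have stopped
        cv.wait(lk, [] { return !pause_requested; }); // sleep until resumed
        paused = false;
    }
}

// UI side: request the pause, wait for the acknowledgement, do the
// non-thread-safe work, then resume the downloader.
void pause_resize_resume() {
    std::unique_lock<std::mutex> lk(m);
    pause_requested = true;
    cv.wait(lk, [] { return paused; });
    // resize buffer or other non-thread-safe work here
    pause_requested = false;
    cv.notify_all();
}

While the UI thread holds the lock and does its resize, the downloader is parked inside cv.wait, so the two threads can never touch the buffer at the same time.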

You might be able to get away with this in some form on a specific platform/architecture, but it is really easy to make a mistake that is very hard to track down.

Vesicle answered 24/4, 2009 at 7:47 Comment(0)
