do integer reads need to be critical section protected?
I have come across some C++03 code that takes this form:

struct Foo {
    int a;
    int b;
    CRITICAL_SECTION cs;
};

// DoFoo::Foo foo_;

void DoFoo::Foolish()
{
    if( foo_.a == 4 )
    {
        PerformSomeTask();

        EnterCriticalSection(&foo_.cs);
        foo_.b = 7;
        LeaveCriticalSection(&foo_.cs);
    }
}

Does the read from foo_.a need to be protected? e.g.:

void DoFoo::Foolish()
{
    EnterCriticalSection(&foo_.cs);
    int a = foo_.a;
    LeaveCriticalSection(&foo_.cs);

    if( a == 4 )
    {
        PerformSomeTask();

        EnterCriticalSection(&foo_.cs);
        foo_.b = 7;
        LeaveCriticalSection(&foo_.cs);
    }
}

If so, why?

Please assume the integers are 32-bit aligned. The platform is ARM.

Shilling answered 30/11, 2012 at 18:16 Comment(7)
Also note that C++11 states that any race condition involving a write is undefined behavior. So if you're writing to foo_.a in a different thread, then yes, it's UB. (§1.10/4 and §1.10/21) C++03 says nothing about concurrency.Perquisite
Use std::atomic if you can and don't worry about it.Pyrotechnics
Please see the edit. This code is limited to C++03 constructs.Shilling
Since you tagged with WinAPI, I'll point out that you can use the Interlocked... functions to do atomic integer reads and writes with slightly less overhead than a critical section. (On Windows, the <atomic> features in C++11 probably use the Interlocked... functions under the hood.)Codon
@AdrianMcCarthy: There is no InterlockedRead, unfortunately. AFAIK you can use InterlockedCompareExchange(v, 0, 0) at a slight performance cost, or write your own InterlockedRead using compiler assumptions and extensions (like _MemoryBarrier and the fact that the contract of the function assumes the supplied integer is aligned properly, and ensure the type is such that the platforms Windows runs on give that integer atomic reads).Pyrotechnics
@PaulH: Your approach should then be to emulate std::atomic's behaviors for your use case.Pyrotechnics
possible duplicate of Reading interlocked variablesCodon
Technically yes; in practice, no on many platforms. First, let us assume that int is 32 bits (which is pretty common, but not nearly universal).

It is possible that the two words (16 bit parts) of a 32 bit int will be read or written to separately. On some systems, they will be read separately if the int isn't aligned properly.

Imagine a system where you can only do 32-bit aligned 32-bit reads and writes (and 16-bit aligned 16-bit reads and writes), and an int that straddles such a boundary. Initially the int is zero (i.e., 0x00000000).

One thread writes 0xBAADF00D to the int, the other reads it "at the same time".

The writing thread first writes 0xBAAD to the high word of the int. The reader thread then reads the entire int (both high and low) getting 0xBAAD0000 -- which is a state that the int was never put into on purpose!

The writer thread then writes the low word 0xF00D.

As noted, on some platforms all 32 bit reads/writes are atomic, so this isn't a concern. There are other concerns, however.

Most lock/unlock code includes instructions to the compiler to prevent reordering across the lock. Without that prevention of reordering, the compiler is free to reorder operations so long as the program behaves "as-if" it ran in order from a single-threaded point of view. So if your code reads a and then b, the compiler could read b before it reads a, so long as it doesn't see an in-thread opportunity for b to be modified in that interval.

So possibly the code you are reading is using these locks to make sure that the read of the variable happens in the order written in the code.

Other issues are raised in the comments below, but I don't feel competent to address them: cache issues, and visibility.

Kush answered 30/11, 2012 at 18:19 Comment(12)
@dmajj, on many platforms, yes, but not necessarily on all platforms.Couperin
You forgot to mention caching effects.Wolfe
On all current major CPU architectures (including ARM), reads and writes of naturally aligned ints and pointers are atomic. Note that you still need the critical section on the write because of compiler and CPU ordering issues.Pb
@dmaij The problem is that it breaks down if the variable is promoted to a register. Furthermore, the relaxed memory model of modern processors can do weird things if you don't force a memory barrier via atomic or critical section.Perquisite
@dmaij: Is it really up to opinion? Unless the definition of correctness in this case is "I don't need this to be accurate or well-defined" (which can be the case sometimes but certainly not for the overwhelming majority nor for OP), you either are factually writing working code or not.Pyrotechnics
@VladLazarenko -- I don't feel sufficiently qualified to describe how they would screw you up in particular (not well enough to include an example).Kush
Being atomic isn't sufficient. You also have to worry about visibility and compiler re-ordering.Barbabas
As @PeteBecker says, you need at least volatile to safely roll your own atomic reads. See Who's afraid of a big bad optimizing compiler? re: how the Linux kernel rolls its own atomics, using de-facto compiler support of the same kind you're depending on to do this in C++03. There are some pretty tricky gotchas, besides the well-known ones like while(!flag){ spin; } optimizing into if(!flag) { while(42){} }. (MCU programming - C++ O2 optimization breaks while loop)Terrier
Depending on the surrounding code and what your compiler had for breakfast, it might make the same asm for a read of a plain int as it would have with volatile, or with std::atomic<int> with memory_order_relaxed. So this definitely does happen to work in a lot of cases, and in 2012 it probably made sense not to use very new C++11 features. But in 2022, there's a lot less excuse for writing unsafe code. At least use *(volatile T*)&foo if you're stuck in a codebase without std::atomic for shared vars.Terrier
@PeterCordes -- I did not say that you need at least volatile. volatile might be appropriate, but only if your compiler makes that promise.Barbabas
@PeteBecker: Right sorry, as you said, a hardware guarantee that load/store of int size being atomic isn't sufficient. The part about volatile working on compilers which at least de-facto support that is on me. That is normally the case for GCC and clang at least. (related: Which types on a 64-bit computer are naturally atomic in gnu C and gnu C++? - sometimes compilers choose to split up a 64-bit store on a RISC machine with a constant like 0xdeadbeefdeadbeef. IIRC I found mention of GCC sometimes splitting even with volatile, but maybe a bugTerrier
Actually that splitting constants thing will only affect stores, and this Q&A is about loads. Err, but not necessarily Interlocked stores, it's including plain stores inside critical sections, so actually Nate's example on Which types on a 64-bit computer are naturally atomic in gnu C and gnu C++? -- meaning they have atomic reads, and atomic writes is a perfect counterexample to loads being safe. The write would also have to make sure it was done atomically. With GCC/clang, making the variable itself volatile would work (affecting the store as well),Terrier
Looking at this, it seems that ARM has quite a relaxed memory model, so you need some form of memory barrier to ensure that writes in one thread are visible when you'd expect them in another thread. So what you are doing, or else using std::atomic, seems likely necessary on your platform. Unless you take this into account, you can see updates out of order in different threads, which would break your example.

Silverts answered 30/11, 2012 at 18:33 Comment(0)
I think you can use C++11 to ensure that integer reads are atomic, using (for example) std::atomic<int>.

Irra answered 30/11, 2012 at 18:20 Comment(0)
The C++ standard says that there's a data race if one thread writes to a variable at the same time as another thread reads from that variable, or if two threads write to the same variable at the same time. It further says that a data race produces undefined behavior. So, formally, you must synchronize those reads and writes.

There are three separate issues when one thread reads data that was written by another thread. First, there is tearing: if writing requires more than a single bus cycle, it's possible for a thread switch to occur in the middle of the operation, and another thread could see a half-written value; there's an analogous problem if a read requires more than a single bus cycle. Second, there's visibility: each processor has its own local copy of the data that it's been working on recently, and writing to one processor's cache does not necessarily update another processor's cache. Third, there are compiler optimizations that reorder reads and writes in ways that would be okay within a single thread, but will break multi-threaded code. Thread-safe code has to deal with all three problems. That's the job of synchronization primitives: mutexes, condition variables, and atomics.

Barbabas answered 30/11, 2012 at 19:45 Comment(0)
Although the integer read/write operation indeed will most likely be atomic, the compiler optimizations and processor cache will still give you problems if you don't do it properly.

To explain - the compiler will normally assume that the code is single-threaded and make many optimizations that rely on that. For example, it might change the order of instructions. Or, if it sees that the variable is only written and never read, it might optimize it away entirely.

The CPU will also cache that integer, so if one thread writes it, the other one might not get to see it until a lot later.

There are two things you can do. One is to wrap it in a critical section as in your original code. The other is to mark the variable as volatile. That will signal the compiler that this variable will be accessed by multiple threads and will disable a range of optimizations, as well as placing special cache-sync instructions (aka "memory barriers") around accesses to the variable (or so I understand). Apparently this is wrong.

Added: Also, as noted by another answer, Windows has Interlocked APIs that can be used to avoid these issues for non-volatile variables.

Stranglehold answered 30/11, 2012 at 18:27 Comment(6)
According to many, marking a variable as volatile was not really intended for variables that may be modified on another thread, though the Microsoft documentation for volatile suggests that it is, and the question is tagged WinAPI.Codon
@AdrianMcCarthy - Well, I'm not an expert. :) I just know that this is about the only use of volatile that I can think of.Stranglehold
Sorry to -1, but we don't need to spread the lie that volatile has anything to do with multithreading any more. If it did, C++11 wouldn't have added std::atomic. Cut any idea you have that volatile is helpful for multithreading. (Disclaimer: some compilers, notably MSVC, gave now-deprecated extensions to volatile which make my previous language-based claim false; this was seen in hindsight as a mistake.)Pyrotechnics
More specifically, this sentence is wrong: "That will signal the compiler that this variable will be accessed by multiple threads". All it says is "reads and writes to this variable are observable behavior (a technical term), so make sure you do it". It says nothing about ordering, atomicity, or any other guarantees you need for correct multithreaded code.Pyrotechnics
@Pyrotechnics - OK, thank you for correcting me. Can you point me to any articles (or explain yourself) why it is wrong, and then what is it exactly that volatile does?Stranglehold
@Vilx-: No problem. A good start is here, and a PDF (so I won't link) titled "C++ and the Perils of Double-Checked Locking".Pyrotechnics