Asked 5/2, 2016 at 14:4 Answered 11/4 at 22:31

Solved c++ c multithreading thread-safety atomic

Are C/C++ fundamental types, like int, double, etc., atomic, e.g. threadsafe?

Are they free from data races; that is, if one thread writes to an object of such a type while another thread reads from it, is the behavior well-defined?

If not, does it depend on the compiler or something else?

Warila answered 5/2, 2016 at 14:4 Comment(19)

Why do you think they should be? I've never worked with a procedural programming language in which variables of fundamental types were atomic. – Filiform 5/2, 2016 at 14:8

preshing.com/20130618/atomic-vs-non-atomic-operations – Calandra 5/2, 2016 at 14:43

No, but they do decay. – Superfetation 5/2, 2016 at 19:2

Do you mean atomic as in "a reader will never see a value with a mix of old and new bytes" (i.e. "tearing")? Or does "well defined" mean the full sequential-consistency guarantees of std::atomic: ordering with respect to other loads/stores? Either way, the C++ standard doesn't provide either guarantee, not even for a char AFAIK. On most hardware, the first (std::atomic<T>::store(val, std::memory_order_relaxed)) is free up to the size of a register (but that still doesn't make a read-modify-write ++i free if you want the whole RMW to be atomic). – Christan 6/2, 2016 at 10:15

Since C/C++ are compiled into machine code, surely it's hardware-dependent as to the behaviour of multiple writes and reads to the same memory location? – Quag 6/2, 2016 at 10:19

@Doddy: It's a lot better to think about writing code to the C++ standard. If you start taking the behaviour of the implementation on your dev box as "the way C++ works", you're going to have a Bad Time, either in the future with a new compiler, or in the future when compiling for a different architecture. E.g. a C++ implementation on AVR needs to do extra work even for an atomic store with relaxed ordering for anything larger than one byte. gcc.godbolt.org unfortunately only has g++ 4.5 for AVR, and doesn't have the full set of libraries anyway. – Christan 6/2, 2016 at 10:23

Anyway, on further reading of the question, and the wording in the standard, this question is apparently asking about the second sense. And the answer is: No, of course not. That would impose MASSIVE performance penalties, since every read-modify-write would have to be atomic and memory barriers would be needed between every memory access. Like, factor of 5 to 100 slowdown is my guess, depending on what the code is doing and what CPU it's running on. Maybe even lower than that for functions that mostly read, not write, on a strongly-ordered architecture like x86, or do a lot with locals. – Christan 6/2, 2016 at 10:25

@ChristianHackl Pretty sure C# guarantees atomic operations for any types 4 bytes or less. I think it's reasonable to think this would be the case. He isn't stating he thinks they should be, simply asking if they are. – Hero 9/2, 2016 at 22:47

@ChadSchouggins: #2434272 – Filiform 10/2, 2016 at 5:36

Possible duplicate of CRITICAL_SECTION for set and get single bool value – Ginoginsberg 22/2, 2016 at 9:0

@ChristianHackl "I've never worked with a procedural programming language in which variables of fundamental types were atomic" Not even Java? – Deutschland 11/12, 2018 at 22:45

@curiousguy: Yes, not even Java. – Filiform 12/12, 2018 at 6:20

@PeterCordes, and what about sense 1? Like, having a read operation of a long read two bytes from the previous value and two bytes from the new value? Considering a long fits a word in a 64 bit system (assuming 8 byte long here), would that still be possible, I wonder? I don't know, it feels to me like the compilers should guarantee that. Either case, I'm using an atomic_long to be safe. Thanks! – Weixel 7/10, 2022 at 19:29

@MarcioLucca: ISO C doesn't guarantee it. Data-race UB is UB, so literally anything can happen, including having it happen to work as you expect some of the time but not all of the time. e.g. Which types on a 64-bit computer are naturally atomic in gnu C and gnu C++? -- meaning they have atomic reads, and atomic writes - on AArch64, x = 0xaaaaaaaaaaaaaaaa compiles to two stores of 0xaaaaaaaa. But constants where the two halves are different do happen to get stored with a single 64-bit str w. GCC, See also lwn.net/Articles/793253 – Christan 7/10, 2022 at 19:38

That's a bummer, lol. Anyway, thanks a lot for the quick response @PeterCordes – Weixel 7/10, 2022 at 19:50

@MarcioLucca: Why a bummer? What did you hope to gain from it that you couldn't with memory_order_relaxed for std::atomic? – Christan 7/10, 2022 at 19:54

@PeterCordes: Nothing really, just simplicity, perhaps. If the compiler/standards give you such guarantees, then you (or especially people beginning to learn the language) don't have to think about it and that's a good thing, I claim. Performance-wise, personally, I'm not too concerned for my use case. That being said, like you mentioned, atomic_longs also solve "sense number 2" (i.e. ordering) and I'm assuming there must be a small penalty for that. In my case, again, I don't need that, I'm fine with two threads reading slightly outdated values. Thanks again! – Weixel 7/10, 2022 at 20:5

@MarcioLucca: Lol, even if they'd been free of tearing when loads or stores actually happen, you absolutely would have to think about it very hard, to make sure your code was safe. e.g. data races being UB lets compilers hoist loads out of loops, like while(!ready) {} into if(!ready) while(42){}. MCU programming - C++ O2 optimization breaks while loop. You only asked about tearing, not stopping the compiler from optimizing variables into registers (let alone memory ordering wrt. other operations), so I didn't mention this earlier. – Christan 7/10, 2022 at 20:15

C and C++ are not the same language, nor is C++ a superset of C. Therefore it is inappropriate to ask a question about both of them. – Chilopod 11/4 at 23:47

No, fundamental data types (e.g., int, double) are not atomic, see std::atomic.

Instead you can use std::atomic<int> or std::atomic<double>.
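As a minimal sketch of the case the question describes (one thread writing while another reads), the following uses std::atomic<int>. The names shared_value and read_while_writing are hypothetical and chosen only for illustration; with a plain int the same access pattern would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared variable; std::atomic<int> makes concurrent access well-defined.
std::atomic<int> shared_value{0};

// One thread stores while another loads. The reader observes either the old
// value (0) or the new value (42), never a torn mixture of the two.
int read_while_writing() {
    shared_value.store(0);
    int seen = -1;
    std::thread writer([] { shared_value.store(42); });
    std::thread reader([&seen] { seen = shared_value.load(); });
    writer.join();
    reader.join(); // join() makes the reader's write to 'seen' visible here
    assert(seen == 0 || seen == 42);
    return seen;
}
```

Which of the two values the reader sees depends on scheduling; the point is only that both outcomes are well-defined.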

Note: std::atomic was introduced with C++11 and my understanding is that prior to C++11, the C++ standard didn't recognize the existence of multithreading at all.

As pointed out by @Josh, std::atomic_flag is an atomic boolean type. It is guaranteed to be lock-free, unlike the std::atomic specializations.
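To illustrate what std::atomic_flag offers, here is a rough sketch of a spinlock built on test_and_set and clear, guarding a plain int. The names lock_flag and count_with_spinlock are assumptions made for this example, and a real program would usually prefer std::mutex.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical lock flag; ATOMIC_FLAG_INIT puts it in the clear state.
std::atomic_flag lock_flag = ATOMIC_FLAG_INIT;

// Two threads increment a plain int, serialized by the atomic_flag spinlock.
int count_with_spinlock(int per_thread) {
    int counter = 0; // non-atomic, only touched while the lock is held
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            // Spin until test_and_set returns false, i.e. we acquired the lock.
            while (lock_flag.test_and_set(std::memory_order_acquire)) {
            }
            ++counter;
            lock_flag.clear(std::memory_order_release); // release the lock
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    assert(counter == 2 * per_thread);
    return counter;
}
```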

The quoted documentation is from: http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4567.pdf. I'm pretty sure the standard is not free and therefore this isn't the final/official version.

1.10 Multi-threaded executions and data races

Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one reads or modifies the same memory location.

The library defines a number of atomic operations (Clause 29) and operations on mutexes (Clause 30) that are specially identified as synchronization operations. These operations play a special role in making assignments in one thread visible to another. A synchronization operation on one or more memory locations is either a consume operation, an acquire operation, a release operation, or both an acquire and release operation. A synchronization operation without an associated memory location is a fence and can be either an acquire fence, a release fence, or both an acquire and release fence. In addition, there are relaxed atomic operations, which are not synchronization operations, and atomic read-modify-write operations, which have special characteristics.

Two actions are potentially concurrent if
(23.1) — they are performed by different threads, or
(23.2) — they are unsequenced, and at least one is performed by a signal handler.
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.

29.5 Atomic types

There shall be explicit specializations of the atomic template for the integral types char, signed char, unsigned char, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long, char16_t, char32_t, wchar_t, and any other types needed by the typedefs in the header <cstdint>. For each integral type integral, the specialization atomic<integral> provides additional atomic operations appropriate to integral types. There shall be a specialization atomic<bool> which provides the general atomic operations as specified in 29.6.1.

There shall be pointer partial specializations of the atomic class template. These specializations shall have standard layout, trivial default constructors, and trivial destructors. They shall each support aggregate initialization syntax.

29.7 Flag type and operations

Operations on an object of type atomic_flag shall be lock-free. [ Note: Hence the operations should also be address-free. No other type requires lock-free operations, so the atomic_flag type is the minimum hardware-implemented type needed to conform to this International standard. The remaining types can be emulated with atomic_flag, though with less than ideal properties. — end note ]

Vibes answered 5/2, 2016 at 14:8 Comment(12)

"Objects of atomic types are the only C++ objects that are free from data races." Really? How about std::mutex then? (Playing devil's advocate here, it's just that that sentence needs a bit of love and some reference into the Standard.) – Psalmist 5/2, 2016 at 14:14

@Psalmist Those aren't my own words. They are just a snippet from the linked documentation. I don't have a copy of the standard. – Vibes 5/2, 2016 at 14:20

That documentation is then wrong. It's a community wiki which sums up some concepts; in this case it's a bit too approximative and skims over the fact that there are many other data types != std::atomic which are free from data races. Only the Standard is the Voice of The One True ^W^W^W^W the reference here. – Psalmist 5/2, 2016 at 14:41

@Psalmist The C++14 Standard states: 1.10 Multi-threaded executions and data races ... The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior. – Presbyterate 5/2, 2016 at 15:12

@AndrewHenle: I know. However the sentence "are the only C++ objects" is wrong. They're not. Counterexample: §30.4.1.2.5 [thread.mutex.requirements.mutex]: "The implementation shall provide lock and unlock operations, as described below. For purposes of determining the existence of a data race, these behave as atomic operations (1.10)". (Again, I was playing devil's advocate and asking for a more formal answer, not a c&p from a summary on a random wiki.) – Psalmist 5/2, 2016 at 15:46

@Psalmist I've updated the answer to try and address your concerns. – Vibes 5/2, 2016 at 15:58

@pepe It's a technicality, really. "Data race" is a term which is defined very specifically in the spec, dealing with the reading and writing of values in memory when more than one thread can access the data at the same time. Mutexes can protect objects from data races, by preventing more than one thread from accessing them, but that's different from making those types data race free. Meanwhile, the mutex function calls are just that, function calls. They are not data accesses, so they are not subject to the concept of data races (unless you destroy a mutex while locking it) – Fucoid 5/2, 2016 at 20:34

And as for the specific quote you gave, what they are doing is promoting a behavior which does not actually fit into the "data race" category into acting as though it did, because that was the easiest wording to describe how mutexes and atomics interact. Do note that they do not say "lock and unlock are data race free"; they say you may treat them as though they are atomic operations for purposes of identifying data races. – Fucoid 5/2, 2016 at 20:36

@CortAmmon: I was not talking at all about mutexes to protect against data races. I was talking about the fact that atomic types are not the only data race-free types; for instance, locking mutexes is a data race-free operation. So I wanted the answer to be modified to quote the Standard about data races, the fact that they're guaranteed not to happen only on conflicting modifications of atomic objects and of a number of other types (mutex, its companions, etc.); and notably, the primitive types are not amongst these, so they're not safe. – Psalmist 5/2, 2016 at 21:25

@JamesAdkison: for reference, you can download the current working draft of the standard for free on isocpp.org (link to Github on the left side). – Psalmist 5/2, 2016 at 21:26

Why did everyone fail to mention that in reality, the only type which is guaranteed, by the C++11 standard, to be atomic is called std::atomic_flag? N3337( §29.7.2) – Miraculous 6/2, 2016 at 14:6

@Josh Thank you, I wasn't aware of this type and have updated the answer. – Vibes 6/2, 2016 at 15:53

Since C is also (currently) mentioned in the question despite not being in the tags, the C Standard states:

5.1.2.3 Program execution

...

When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects that are neither lock-free atomic objects nor of type volatile sig_atomic_t are unspecified, as is the state of the floating-point environment. The value of any object modified by the handler that is neither a lock-free atomic object nor of type volatile sig_atomic_t becomes indeterminate when the handler exits, as does the state of the floating-point environment if it is modified by the handler and not restored to its original state.

and

5.1.2.4 Multi-threaded executions and data races

...

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

[several pages of standards - some paragraphs explicitly addressing atomic types]

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

Note that values are "indeterminate" if a signal interrupts processing, and simultaneous access to types that are not explicitly atomic is undefined behavior.

Presbyterate answered 5/2, 2016 at 14:23 Comment(3)

Note that C11 adds the _Atomic type qualifier and the <stdatomic.h> header... – Ardin 5/2, 2016 at 15:27

ISO WG14 (C) and WG21 (C++) coordinated to make sure their memory models are similar. That makes it OK to have both the C and C++ tags here. Don't assume that applies to other questions, though ! – Tacky 5/2, 2016 at 16:21

AIUI The guarantees on sig_atomic_t only apply to signal interrupts, not threads or shared memory. – Almira 5/2, 2016 at 18:45

What is atomic?

Atomic, as in describing something with the property of an atom. The word atom comes, via Latin atomus, from Greek atomos, meaning "indivisible".

Typically I think of an atomic operation (regardless of language) to have two qualities:

An atomic operation is always undivided.

I.e. it is performed in an indivisible way. I believe this is what OP refers to as "threadsafe". In a sense the operation happens instantaneously when viewed by another thread.

For example the following operation is likely divided (compiler/hardware dependent):

i += 1;

because it can be observed by another thread (on hypothetical hardware and compiler) as:

load r1, i;
addi r1, #1;
store i, r1;

Two threads doing the above operation i += 1 without appropriate synchronization may produce the wrong result. Say i=0 initially, thread T1 loads T1.r1 = 0, and thread T2 loads T2.r1 = 0. Both threads increment their respective r1s by 1 and then store the result to i. Although two increments have been performed, the value of i is still only 1 because the increment operation was divisible. Note that had there been synchronization before and after i+=1, the other thread would have waited until the operation was complete and thus would have observed an undivided operation.
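The lost update described above can be avoided by making the whole read-modify-write a single atomic operation. A small sketch, assuming a hypothetical global counter named i_atomic and a helper increment_from_two_threads invented for this example:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical counter standing in for the variable i discussed above.
std::atomic<int> i_atomic{0};

// Each thread performs per_thread increments. fetch_add is one indivisible
// read-modify-write, so no increment can be lost between load and store.
int increment_from_two_threads(int per_thread) {
    i_atomic.store(0);
    auto work = [per_thread] {
        for (int n = 0; n < per_thread; ++n) {
            i_atomic.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    int total = i_atomic.load();
    assert(total == 2 * per_thread);
    return total;
}
```

Relaxed ordering is enough here because only the final count matters, not the ordering relative to other memory operations.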

Note that even a simple write may or may not be undivided:

i = 3;

store i, #3;

depending on the compiler and hardware. For example if the address of i is not aligned suitably, then an unaligned load/store has to be used which is executed by the CPU as several smaller loads/stores.

An atomic operation has guaranteed memory ordering semantics.

Non atomic operations may be re-ordered and may not necessarily occur in the order written in the program source code.

For example, under the "as-if" rule the compiler is allowed to re-order stores and loads as it sees fit, as long as all accesses to volatile memory occur in the order specified by the program, "as if" the program was evaluated according to the wording in the standard. Thus non-atomic operations may be re-arranged, breaking any assumptions about execution order in a multi-threaded program. This is why a seemingly innocent use of a raw int as a signaling variable in multi-threaded programming is broken: even if writes and reads happen to be indivisible, the re-ordering may break the program, depending on the compiler. An atomic operation enforces ordering of the operations around it, depending on what memory semantics are specified. See std::memory_order.

The CPU may also re-order your memory accesses under the memory ordering constraints of that CPU. You can find the memory ordering constraints for the x86 architecture in the Intel 64 and IA32 Architectures Software Developer Manual section 8.2 starting at page 2212.
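As a sketch of the signaling pattern mentioned above done with an atomic instead of a raw int, the following pairs a release store with an acquire load. The names payload, ready and pass_message are hypothetical; the point is that the plain write to payload is guaranteed to be visible once the consumer has seen ready == true.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                // plain, non-atomic data
std::atomic<bool> ready{false}; // signaling variable

// The producer writes the payload, then publishes it with a release store.
// The consumer spins on an acquire load; once it reads true, the write to
// payload happens-before the read of payload, so there is no data race.
int pass_message() {
    payload = 0;
    ready.store(false);
    int received = -1;
    std::thread consumer([&received] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;
    });
    std::thread producer([] {
        payload = 123;
        ready.store(true, std::memory_order_release);
    });
    producer.join();
    consumer.join();
    assert(received == 123);
    return received;
}
```

With a plain bool in place of ready, the compiler would be allowed to hoist the load out of the loop or re-order the two stores, which is the breakage described above.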

Primitive types (`int`, `char` etc) are not Atomic

Even if, under certain conditions, they have indivisible store and load instructions, or possibly even some indivisible arithmetic instructions, they do not guarantee the ordering of stores and loads. As such they are unsafe to use in multi-threaded contexts without proper synchronization to guarantee that the memory state observed by other threads is what you think it is at that point in time.

I hope this explains why primitive types are not atomic.

Engadine answered 5/2, 2016 at 22:51 Comment(5)

@DavidSchwartz Sure, the caches are coherent; it's the store buffers that aren't. Even on x86 - see for instance examples 8-3 and 8-5 in chapter 8.2 of the System Programming Guide. Granted it's hardly the wild west of memory ordering like Alpha or POWER, but to say all cores always read the same values at all times is still strictly false per the architecture. – Kaufman 6/2, 2016 at 1:53

@Kaufman Of course a core won't see a store before that store happens. But there is no "brief moment when the caches of the cores are de-synched". That's just nonsense. – Irena 6/2, 2016 at 2:11

@DavidSchwartz True that that exact wording is erroneous, but the point is there is a period after a write by one core where a read by a different core can still get the old value ("after" in the sense that a read by that first core will return the new value). So the store has both happened and not happened, depending on where you observe from. I'll just point at page 2217 of this and shut up now ;) – Kaufman 6/2, 2016 at 11:32

@Kaufman Either you're trying to accurately explain how actual hardware works or you aren't. If you are, then you failed, since this has nothing to do with the caches. If you aren't, then this is all needless complication and you'd do much better to talk about the standards. This may seem like needless nitpicking, but I've had to correct this kind of misinformation literally hundreds of times when it becomes a source of misinformation cited by other people who misunderstand how the actual hardware works. – Irena 6/2, 2016 at 21:19

Re: the fact that coherent caches are a feature of all real-world CPUs that we run C++ threads across: When to use volatile with multi threading? (never, but it works in practice sort of like mo_relaxed because of coherent caches. In the bad old days before C++11, that's how lock-free code hand-rolled their own atomics, so mainstream compilers did at least de-facto support this.) – Christan 8/10, 2022 at 13:45

One additional piece of information I haven't seen mentioned in the other answers so far:

If you use std::atomic<bool>, for example, and bool is actually atomic on the target architecture, then the compiler will not generate any redundant fences or locks. The same code would be generated as for a plain bool.

In other words, using std::atomic only makes the code less efficient if it is actually required to for correctness on the platform. So there is no reason to avoid it.
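One way to check this on a given platform is to ask the implementation whether the atomic type is lock-free. The sketch below uses the hypothetical names stop_requested, atomic_bool_is_lock_free and round_trip_relaxed. It also uses std::memory_order_relaxed, which asks for atomicity only; on mainstream targets such accesses typically compile to ordinary load and store instructions, though that is an implementation property rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag used only for this illustration.
std::atomic<bool> stop_requested{false};

// Reports whether this implementation needs a lock for std::atomic<bool>.
bool atomic_bool_is_lock_free() {
    return stop_requested.is_lock_free();
}

// Stores and reloads with relaxed ordering: atomicity only, with no ordering
// guarantees relative to other memory operations.
bool round_trip_relaxed(bool value) {
    stop_requested.store(value, std::memory_order_relaxed);
    bool seen = stop_requested.load(std::memory_order_relaxed);
    assert(seen == value); // single-threaded here, so the value must match
    return seen;
}
```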

Coincidental answered 5/2, 2016 at 23:45 Comment(0)

Implementations are free to specify that "ordinary" accesses to objects will be processed using semantics whose behavior is defined in more situations than required by the Standard. Implementations offering such semantics were used to perform tasks requiring such semantics on machines that offered them essentially "for free", decades before atomic types were added to the C or C++ language standards. If code was targeting platforms where a read of a 16-bit value which occurred at the same time as an attempt to modify it would never do anything other than yield a (possibly meaningless) 16-bit value, then C code that read a 16-bit value would likewise have no effect beyond yielding such a value. There was no perceived need for the Standard to recognize such guarantees, because nobody imagined that compilers targeting platforms that offered them, designed for tasks that would benefit from them, would ever do anything else.

Compilers like gcc, however, will sometimes replace what would appear to be a single load operation with two separate loads, and generate code which will malfunction if they don't yield the same value. As an example, ARM gcc 10.2.1, given command-line arguments -O1 -mcpu=cortex-m0 and the following function:

unsigned test(unsigned short *p)
{
    unsigned short temp = *p;
    temp -= temp >> 15;
    return temp;
}

will generate machine code which is equivalent to the following operations (one machine instruction per line in the function body):

unsigned test(unsigned short *p)
{
  unsigned r2 = *p;                // first load (zero-extended)
  unsigned r3 = 0;
  int r0 = *(short*)((char*)p+r3); // second load: 16-bit signed load requires index reg
  r0 >>= 15;                       // arithmetic shift: yields 0 or -1
  r0 += r2;
  r0 &= 0xFFFF;
  return r0;
}

If the value of *p changes from 0xFFFF to 0x0000 or vice versa between the two loads, the function may return a value which is inconsistent with the original version's load of *p yielding any 16-bit value.
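For comparison, here is a sketch of the same computation written against std::atomic<unsigned short>. The function name test_atomic and the changed parameter type are assumptions made for this example, not part of the original code. Because the value is obtained through a single atomic load, the compiler may not split it into two separate reads of the shared object.

```cpp
#include <atomic>
#include <cassert>

// Same arithmetic as test() above, but the shared value is read exactly once
// through an atomic load, so both uses of temp see the same 16-bit value.
unsigned test_atomic(const std::atomic<unsigned short>& p)
{
    unsigned short temp = p.load(std::memory_order_relaxed);
    temp -= temp >> 15; // subtracts 1 only when the top bit is set
    return temp;
}
```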

Suspicious answered 11/4 at 22:31 Comment(0)
