Which types on a 64-bit computer are naturally atomic in gnu C and gnu C++? -- meaning they have atomic reads, and atomic writes

Asked 14/4, 2022 at 4:31 Answered 14/4, 2022 at 6:46

c++ c gcc x86-64 atomic

NB: For this question, I'm not talking about the C or C++ language standards. Rather, I'm talking about gcc compiler implementations for a particular architecture, as the only guarantees for atomicity by the language standards are to use _Atomic types in C11 or later or std::atomic<> types in C++11 or later. See also my updates at the bottom of this question.

On any architecture, some data types can be read atomically, and written atomically, while others will take multiple clock cycles and can be interrupted in the middle of the operation, causing corruption if that data is being shared across threads.

On 8-bit single-core AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno, Nano, or Mini), only 8-bit data types have atomic reads and writes (with the gcc compiler and gnu C or gnu C++ language). I had a 25-hr debugging marathon in < 2 days and then wrote this answer here. See also the bottom of this question for more info and documentation on 8-bit variables having naturally atomic writes and naturally atomic reads on AVR 8-bit microcontrollers when compiled with the gcc compiler, which uses the AVR-libc library.

On (32-bit) STM32 single-core microcontrollers, any data type 32-bits or smaller is definitively automatically atomic (when compiled with the gcc compiler and the gnu C or gnu C++ language, as ISO C and C++ make no guarantees of this until the 2011 versions with _Atomic types in C11 and std::atomic<> types in C++11). That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers. The only non-atomic types are int64_t/uint64_t, double (8 bytes), and long double (also 8 bytes). I wrote about that here:

Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?

My computer has an x86-64 processor, and Linux Ubuntu OS.

I am okay using Linux headers and gcc extensions.

I see a couple of interesting things in the gcc source code indicating that at least the 32-bit int type is atomic. Ex: the Gnu++ header <bits/atomic_word.h>, which is stored at /usr/include/x86_64-linux-gnu/c++/8/bits/atomic_word.h on my computer, and is here online, contains this:

typedef int _Atomic_word;

So, int is clearly atomic.

And the Gnu++ header <bits/types.h>, included by <ext/atomicity.h>, and stored at /usr/include/x86_64-linux-gnu/bits/types.h on my computer, contains this:

/* C99: An integer type that can be accessed as an atomic entity,
   even in the presence of asynchronous interrupts.
   It is not currently necessary for this to be machine-specific.  */
typedef int __sig_atomic_t;

So, again, int is clearly atomic.

Here is some sample code to show what I am talking about...

...when I say that I want to know which types have naturally atomic reads, and naturally atomic writes, but not atomic increment, decrement, or compound assignment.

volatile bool shared_bool;
volatile uint8_t shared_u8;
volatile uint16_t shared_u16;
volatile uint32_t shared_u32;
volatile uint64_t shared_u64;
volatile float shared_f; // 32-bits
volatile double shared_d; // 64-bits

// Task (thread) 1
while (true)
{
    // Write to the values in this thread.
    //
    // What I write to each variable will vary. Since other threads are reading
    // these values, I need to ensure my *writes* are atomic, or else I must
    // use a mutex to prevent another thread from reading a variable in the
    // middle of this thread's writing.
    shared_bool = true;
    shared_u8 = 129;
    shared_u16 = 10108;
    shared_u32 = 130890;
    shared_f = 1083.108;
    shared_d = 382.10830;
}

// Task (thread) 2
while (true)
{
    // Read from the values in this thread.
    //
    // What thread 1 writes into these values can change at any time, so I need
    // to ensure my *reads* are atomic, or else I'll need to use a mutex to
    // prevent the other thread from writing to a variable in the midst of
    // reading it in this thread.
    if (shared_bool == whatever)
    {
        // do something
    }
    if (shared_u8 == whatever)
    {
        // do something
    }
    if (shared_u16 == whatever)
    {
        // do something
    }
    if (shared_u32 == whatever)
    {
        // do something
    }
    if (shared_u64 == whatever)
    {
        // do something
    }
    if (shared_f == whatever)
    {
        // do something
    }
    if (shared_d == whatever)
    {
        // do something
    }
}

C `_Atomic` types and C++ `std::atomic<>` types

I know C11 and later offers _Atomic types, such as this:

const _Atomic int32_t i;
// or (same thing)
const atomic_int_least32_t i;

See here:

And C++11 and later offers std::atomic<> types, such as this:

const std::atomic<int32_t> i;
// or (same thing)
const std::atomic_int32_t i;

See here:

https://en.cppreference.com/w/cpp/atomic/atomic

And these C11 and C++11 "atomic" types offer atomic reads and atomic writes as well as atomic increment operator, decrement operator, and compound assignment...

...but that's not really what I'm talking about.

I want to know which types have naturally atomic reads and naturally atomic writes only. For what I am talking about, increment, decrement, and compound assignment will not be naturally atomic.

Update 14 Apr. 2022

I had some chats with someone from ST, and it seems the STM32 microcontrollers only guarantee atomic reads and writes for variables of certain sizes under these conditions:

1. You use assembly.
2. You use the C11 _Atomic types or the C++11 std::atomic<> types.
3. You use the gcc compiler with gnu language and gcc extensions.

I'm most interested in this last one, since that's what the crux of my assumptions at the top of this question seems to have been based on for the last 10 years, without me realizing it. I'd like help finding the gcc compiler manual and the places in it where it explains these atomic access guarantees that apparently exist. We should check the:
  1. AVR gcc compiler manual for 8-bit AVR ATmega microcontrollers.
  2. STM32 gcc compiler manual for 32-bit ST microcontrollers.
  3. x86-64 gcc compiler manual??--if such a thing exists, for my 64-bit Ubuntu computer.

My research thus far:

AVR gcc: no avr gcc compiler manual exists. Rather, use the AVR-libc manual here: https://www.nongnu.org/avr-libc/ --> "Users Manual" links.
1. The AVR-libc user manual in the <util/atomic> section backs up my claim that 8-bit types on AVR, when compiled by gcc, already have naturally atomic reads and naturally atomic writes. It implies that 8-bit reads and writes are already atomic by saying (emphasis added):
   "A typical example that requires atomic access is a 16 (or more) bit variable that is shared between the main execution path and an ISR."
2. It is talking about C code, not assembly, as all examples it gives on that page are in C, including the one for the volatile uint16_t ctr variable, immediately following that quote.

Thilde answered 14/4, 2022 at 4:31 Comment(21)

This is dependent on both the processor and the compiler. It seems you are interested only in the case of x86-64 and gcc, seeing as you are digging into internal headers. But I'm not sure. If you are looking for a portable answer, use is_always_lock_free to detect which types are atomically readable/updatable. (And you have to use atomic<> to get the atomic behavior.) – Saire 14/4, 2022 at 4:40

Are you asking about the hardware or the language? For the hardware, there are no such thing as type. For the language, there are no atomic types other than those that are provided as atomics. – Overflight 14/4, 2022 at 4:40

@RaymondChen, a good demo of std::atomic<T>::is_always_lock_free() could be useful. What does it mean to be "lock free", exactly? An answer could be useful. Also, is there an equivalent to this in C? I frequently use both languages and would like to know a C solution too. – Thilde 14/4, 2022 at 4:45

@RaymondChen, also std::atomic<> goes beyond just simple atomic reads and writes. See here: en.cppreference.com/w/cpp/atomic/atomic. It offers atomic increment, decrement, and compound assignment such as operator+=, too, meaning, I suspect they are using locking of some sort under-the-hood in the implementation of that. – Thilde 14/4, 2022 at 4:46

@PasserBy, hardware has no atomic types? That doesn't sound right. Looking at the hardware of 8-bit AVR mcus and 32-bit STM32 mcus, for instance, the hardware clearly is tied to the size of the variables for which atomic reads and writes are supported. Furthermore, the read-modify-write instructions required to increment mutexes must also be supported at the hardware level. – Thilde 14/4, 2022 at 4:48

What I mean is, the idea of types in the language doesn't exist in the hardware. It's all just instructions. – Overflight 14/4, 2022 at 4:50

@PasserBy I think what the OP is looking for is that the HW has a unit of data that can be manipulated atomically (e.g. 1 byte or 4 bytes) -- and there's a C type that corresponds to this (e.g. 4 bytes == int). – Predesignate 14/4, 2022 at 4:57

@Predesignate The problem is, the idea of atomics in the language doesn't map cleanly to hardware. The language says none of those are atomic other than the explicitly atomic ones. Worse yet, C++ says any type can be used in std::atomic. So the question might be, which atomic types are lock free? But that's not all, there's atomic operations on atomic types which aren't a single instruction even if it's lock free. – Overflight 14/4, 2022 at 5:3

Correction to my previous comment: implement mutexes, not increment mutexes. – Thilde 14/4, 2022 at 5:4

@Barmar, correct. That's exactly what I'm saying. – Thilde 14/4, 2022 at 5:5

@PasserBy That's why he said "naturally atomic". He's not looking for something that implements atomicity using extra code, just the types where the hardware implements things like increment atomically. – Predesignate 14/4, 2022 at 5:6

@PasserBy, The language says. I'm not asking what the language says though, exactly. The language for AVR is C or C++, and the language for STM32 is C or C++, yet the hardware says what types are atomic there, and we have definitive answers for the languages, despite the languages not specifying. In other words, for the languages, the answer is likely unspecified. But, for the compiler on a given architecture, it is likely well-defined, like in AVR and STM32. I see your points about lock free types and multiple instruction-types though. – Thilde 14/4, 2022 at 5:7

I'm not sure how reliably you can generalize about "a 64-bit computer". One 64-bit computer might do things one way; a different 64-bit computer might do things differently. Hardware manufacturers aren't required to make guarantees of atomicity if they don't want to. – Arlenarlena 14/4, 2022 at 5:11

@Barmar, correct again, except not where the hardware implements things like increment atomically, but rather where it implements writes atomically, and reads atomically, as things like increment are not naturally atomic on AVR nor STM32, so I suspect they won't be on x86-64 (or other 64-bit full computer processors) either. – Thilde 14/4, 2022 at 5:11

@JeremyFriesner, maybe there are some quick checks we can do though to see, at initialization in the code, kind of like how here is some code I wrote to check endianness? Does std::atomic<>::is_always_lock_free() do that? I don't really know what "always lock free" means, and unfortunately that call acts on std::atomic<> types only, not regular types. – Thilde 14/4, 2022 at 5:13

@GabrielStaples as I understand it, std::atomic<>::is_always_lock_free() returns true iff the compiler can guarantee that that std::atomic type will never require the implicit locking/unlocking of a mutex to implement its atomicity guarantees. It's probably what you want. – Arlenarlena 14/4, 2022 at 5:16

It's an extremely common misunderstanding that just because the compiler can read a certain size of data in a single instruction, code using variables with that size or smaller magically turns atomic. That assumption only applies to assembler, never to C. See this: Using volatile in embedded C development That answer also contains a much simpler and better way of protecting variables from race conditions on MCU systems than the answer you linked, by simply using a boolean flag variable. – Debt 14/4, 2022 at 6:21

@Lundin, I left a comment on your answer you linked-to. You cannot claim 8-bit writes cannot be guaranteed to be atomic and then use an 8-bit write with a bool semaphore which must be guaranteed to be atomic. You must either acknowledge that bool semaphore is useless, or acknowledge that the 8-bit write is atomic. As it is written, it is contradictory. I say the 8-bit write is atomic. – Thilde 14/4, 2022 at 6:48

There are two issues: (1) What can the CPU perform atomically? A: Read the CPU data sheet. (2) How do I convince my compiler to perform those operations? A: Use the language-defined atomic data types. In C++, you would static_assert(std::atomic<int32_t>::is_always_lock_free) to verify that the compiler supports the underlying CPU operation, and then use value.load(std::memory_order_relaxed) to perform an unordered read or value.store(newvalue, std::memory_order_relaxed) to perform an unordered write. Unordered reads/writes almost always compile to a single load or store instruction. – Saire 14/4, 2022 at 14:20
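The two steps described in the comment above could be sketched roughly as follows. This is only an illustration, not code from the thread: the variable name shared_u32 and the helpers publish, snapshot, and usage_example are made up for this sketch, and it assumes a C++17 compiler, where is_always_lock_free is a static member constant rather than a function.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical shared variable, named for this sketch only.
std::atomic<std::uint32_t> shared_u32{0};

// Compile-time check: the build fails if this target would need a hidden lock
// to implement a 32-bit atomic.
static_assert(std::atomic<std::uint32_t>::is_always_lock_free,
              "32-bit atomics are not lock-free on this target");

// Writer side: an atomic store with no ordering constraints.
void publish(std::uint32_t value) {
    shared_u32.store(value, std::memory_order_relaxed);
}

// Reader side: an atomic load with no ordering constraints.
std::uint32_t snapshot() {
    return shared_u32.load(std::memory_order_relaxed);
}

// Usage example: a value written by publish() comes back whole from snapshot().
void usage_example() {
    publish(130890u);
    assert(snapshot() == 130890u);
}
```

Relaxed ordering only promises that each individual load or store is indivisible; it says nothing about the order in which different variables become visible to another thread.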

And to add to @Ray – Paraffinic 14/4, 2022 at 17:46

If you're interested specifically in gcc, you can read the gcc documentation on forced atomic memory access intrinsics. If you want to guarantee atomic access, use those intrinsics. Otherwise, the compiler may choose to use a non-atomic access. – Saire 14/4, 2022 at 19:4
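As a rough sketch of what the comment above points at, the GCC __atomic builtins can be applied to an ordinary variable. The builtins __atomic_store_n, __atomic_load_n, and __atomic_always_lock_free are documented GCC extensions (Clang accepts them too); the variable plain_u64 and the helper names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical ordinary (non-std::atomic) shared variable.
std::uint64_t plain_u64 = 0;

// Force an indivisible 64-bit store through the GCC builtin.
void write_shared(std::uint64_t value) {
    __atomic_store_n(&plain_u64, value, __ATOMIC_RELAXED);
}

// Force an indivisible 64-bit load through the GCC builtin.
std::uint64_t read_shared() {
    return __atomic_load_n(&plain_u64, __ATOMIC_RELAXED);
}

// Ask the compiler whether objects of this size, at typical alignment (the 0
// argument), are always handled without a lock on the current target.
bool u64_is_always_lock_free() {
    return __atomic_always_lock_free(sizeof(std::uint64_t), 0);
}

// Usage example: the value round-trips through the two helpers.
void usage_example() {
    write_shared(0xdeadbeefdeadbeefULL);
    assert(read_shared() == 0xdeadbeefdeadbeefULL);
}
```

Every access to the variable has to go through the builtins for this to help; a single plain read or write elsewhere puts the compiler back in charge of how that access is emitted.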

The answer from the point of view of the language standard is very simple: none of them are "definitively automatically" atomic.

First of all, it's important to distinguish between two senses of "atomic".

One is atomic with respect to signals. This ensures, for instance, that when you do x = 5 on a volatile sig_atomic_t, then a signal handler invoked in the current thread will see either the old or new value. This is usually accomplished simply by doing the access in one instruction, since signals can only be triggered by hardware interrupts, which can only arrive between instructions. For instance, x86 add dword ptr [var], 12345, even without a lock prefix, is atomic in this sense.
The other is atomic with respect to threads, so that another thread accessing the object concurrently will see a correct value. This is more difficult to get right. In particular, ordinary variables of type volatile sig_atomic_t are not atomic with respect to threads. You need _Atomic or std::atomic to get that.
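The two senses above might look like this in code. This is a minimal sketch with made-up names (signal_flag, thread_flag, on_signal, raise_and_read_flag), and the choice of SIGINT is arbitrary; it only shows which type goes with which sense.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Sense 1: atomic with respect to a signal handler running in the same thread.
volatile std::sig_atomic_t signal_flag = 0;

extern "C" void on_signal(int) {
    signal_flag = 1;  // a plain write to volatile sig_atomic_t is allowed here
}

// Sense 2: atomic with respect to other threads.
std::atomic<int> thread_flag{0};

// Installs the handler, raises the signal in the current thread, restores the
// previous handler, and returns what the handler wrote (-1 on setup failure).
int raise_and_read_flag() {
    auto previous = std::signal(SIGINT, on_signal);
    if (previous == SIG_ERR) {
        return -1;
    }
    std::raise(SIGINT);             // the handler runs before raise() returns
    std::signal(SIGINT, previous);  // put the old handler back
    return signal_flag;
}

// Usage example for the thread-safe flag: store and load are both atomic.
void usage_example() {
    thread_flag.store(5, std::memory_order_relaxed);
    assert(thread_flag.load(std::memory_order_relaxed) == 5);
}
```

Only thread_flag may be touched from a second thread; signal_flag is safe solely between the handler and the code it interrupts.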

Note well that the internal names your implementation chooses for its types are not evidence of anything. From typedef int _Atomic_word; I would certainly not infer that "int is clearly atomic"; I don't know in what sense the implementers were using the word "atomic", or whether it's accurate (could be used by legacy code, for instance). If they wanted to make such a promise it would be in the documentation, not in an unexplained typedef in a bits header that is never meant to be seen by the application programmer.

The fact that your hardware may make certain types of access "automatically atomic" does not tell you anything at the level of C/C++. For instance, it is true on x86 that ordinary full-size loads and stores to naturally aligned variables are atomic. But in the absence of std::atomic, the compiler is under no obligation to emit ordinary full-size loads and stores; it is entitled to be clever and access those variables in other ways. It "knows" this will be no problem, because concurrent access would be a data race, and of course the programmer would never write code with a data race, would they?

As a concrete example, consider the following code:

unsigned x;

unsigned foo(void) {
    return (x >> 8) & 0xffff;
}

A load of a nice 32-bit integer variable, followed by some arithmetic. What could be more innocent? Yet check out the assembly emitted by GCC 11.2 -O2 try on godbolt:

foo:
        movzx   eax, WORD PTR x[rip+1]
        ret

Oh dear. A partial load, and unaligned to boot.

Fortunately, x86 does guarantee that a 16-bit load or store contained within an aligned dword is atomic, even if unaligned, on P5 Pentium or later. In fact, any 1, 2, or 4-byte load or store that fits within an aligned 8-byte is atomic on x86-64, so this would be a valid optimization even if x had been std::atomic<int>. But in that case GCC would have missed the optimization.

Both Intel and AMD separately guarantee this; Intel does so for P5 Pentium and later, which includes all their x86-64 CPUs. There is no single "x86" document that lists the common subset of atomicity guarantees. A Stack Overflow answer combines the guarantees from those two vendors; presumably it's also atomic on other vendors like Via / Zhaoxin.

Hopefully this is also guaranteed in any emulators or binary translators that turn this x86 instruction into, for example, AArch64 machine code, but that's definitely something to worry about if there isn't a matching atomicity guarantee on the host machine.

Here is another fun example, this time on ARM64. Aligned 64-bit stores are atomic, per B2.2.1 of the ARMv8-A Architecture Reference Manual. So this looks fine:

unsigned long x;

void bar(void) {
    x = 0xdeadbeefdeadbeef;
}

But, GCC 11.2 -O2 gives (godbolt):

bar:
        adrp    x1, .LANCHOR0
        add     x2, x1, :lo12:.LANCHOR0
        mov     w0, 48879
        movk    w0, 0xdead, lsl 16
        str     w0, [x1, #:lo12:.LANCHOR0]
        str     w0, [x2, 4]
        ret

That's two 32-bit strs, not atomic in any way. A reader may very well read 0x00000000deadbeef.

Why do it this way? Materializing a 64-bit constant in a register takes several instructions on ARM64, with its fixed instruction size. But both halves of the value are equal, so why not materialize the 32-bit value and store it to each half?

(If you do unsigned long *p; *p = 0xdeadbeefdeadbeef then you get stp w1, w1, [x0] (godbolt). That looks more promising as it is a single instruction, but in base ARMv8-A it is in fact still two separate writes for purposes of atomicity between threads. The LSE2 feature, optional in ARMv8.2-A and mandatory in ARMv8.4-A, does make ldp/stp atomic under reasonable alignment conditions.)
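For contrast, here is a sketch of the same store written against std::atomic, which does not permit the compiler to split the write into two halves. The names x_atomic, bar_atomic, and read_x are invented for this example, and std::uint64_t is used instead of unsigned long so that the width does not depend on the platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical atomic counterpart of the plain global x in the example above.
std::atomic<std::uint64_t> x_atomic{0};

// Even a relaxed store must appear to other threads as one indivisible
// 64-bit write, so the store-each-half trick is off the table here.
void bar_atomic() {
    x_atomic.store(0xdeadbeefdeadbeefULL, std::memory_order_relaxed);
}

// A reader sees either the old value or the complete new value.
std::uint64_t read_x() {
    return x_atomic.load(std::memory_order_relaxed);
}

// Usage example: the full constant comes back after the store.
void usage_example() {
    bar_atomic();
    assert(read_x() == 0xdeadbeefdeadbeefULL);
}
```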

User supercat's answer to Are concurrent unordered writes with fencing to shared memory undefined behavior? has another nice example for ARM32 Thumb, where the C source asks for an unsigned short to be loaded once, but the generated code loads it twice. In the presence of concurrent writes, you could get an "impossible" result.

One can provoke the same on x86-64 (godbolt):

_Bool x, y, z;

void foo(void) {
    _Bool tmp = x;
    y = tmp;
    // imagine elaborate computation here that needs lots of registers
    z = tmp;
}

GCC will reload x instead of spilling tmp. On x86 you can load a global with just one instruction, but spilling to the stack would need at least two. So if x is being concurrently modified, either by threads or by signals/interrupts, then assert(y == z) afterwards could fail.
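One way to rule out the reload is to make the source variable atomic, so that the single load in the source stays a single load in the generated code. The following is a sketch under that assumption; x_flag, y_copy, z_copy, and foo_once are hypothetical names standing in for x, y, z, and foo above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical atomic counterpart of x; y_copy and z_copy stay ordinary bools
// because only this function writes them in the sketch.
std::atomic<bool> x_flag{false};
bool y_copy = false;
bool z_copy = false;

void foo_once() {
    // Exactly one atomic load; the compiler may not replace a later use of
    // tmp with a second load that could observe a different value.
    const bool tmp = x_flag.load(std::memory_order_relaxed);
    y_copy = tmp;
    // imagine elaborate computation here that needs lots of registers
    z_copy = tmp;
}

// Usage example: both copies always agree, whatever x_flag held.
void usage_example() {
    x_flag.store(true, std::memory_order_relaxed);
    foo_once();
    assert(y_copy == z_copy);
}
```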

It really isn't safe to assume anything beyond what the language actually guarantees, which is nothing unless you use std::atomic. Modern compilers know the exact limits of the language rules very well, and optimize aggressively. They can and will break code that assumes they will do what would be "natural", if that is outside the bounds of what the language promises, and they will very often do it in ways that one would never expect.

Mindoro answered 14/4, 2022 at 5:59 Comment(23)

Thank you for the answer. You said it is true on x86 that ordinary full-size loads and stores to naturally aligned variables are atomic. What is a "full-size load"? 64-bits, since it's a 64-bit architecture? – Thilde 14/4, 2022 at 6:15

Can you comment on 8-bit AVR and 32-bit STM-32 microcontrollers? What makes them different? Why/how is code written on them withOUT the use of _Atomic and std::atomic<> safe, with clearly-defined boundaries of which types have atomic reads and which don't? – Thilde 14/4, 2022 at 6:19

@GabrielStaples: A 16-bit load of a 16-bit variable, a 32-bit load of a 32-bit variable, a 64-bit load of a 64-bit variable. – Mindoro 14/4, 2022 at 6:19

Oh, right. "Load" meaning the verb: the load instruction, and "store" meaning the verb: the store instruction. I don't know assembly. I was thinking you meant "full-size load" where "load" is a noun, like a chunk of bits--ex: the standard or maximum chunk of bits the processor can act on in a single instruction. – Thilde 14/4, 2022 at 6:21

@GabrielStaples: I do not have direct experience with AVR or ARM32, but I strongly suspect that it isn't safe. Programmers may be writing code based on pre-C11 habits and figuring "it hasn't failed a test yet, so it must be fine". Or taking advantage of inside knowledge or experience about optimizations that their particular compilers don't currently do, even though they in principle could. – Mindoro 14/4, 2022 at 6:21

@GabrielStaples It is very rare that a variable access boils down to a single instruction. Most often they are copied between registers and the stack, so you get multiple instructions. This is especially true for antique cores like AVR that pretty much can't do anything at all without loading data into registers first. – Debt 14/4, 2022 at 6:27

Nate, I just did a quick check. The Arduino IDE for instance, which compiles in C++, doesn't have the <atomic> header to use std::atomic<> types, doesn't allow including the <stdatomic.h> header, and doesn't allow using _Atomic. Other tools must be used. – Thilde 14/4, 2022 at 7:14

The convention I have seen for a decade is to do nothing to protect 8-bit reads and writes (since they are atomic), and disable global interrupts to protect increment, decrement, compound assignment, or even simple reads or writes of >8-bit types. – Thilde 14/4, 2022 at 7:17

I've written extensive code with interrupts and never had a problem with data races between volatile global variables shared between ISR contexts and the main code so long as I perfectly followed those rules. The assumption of 8-bit reads and writes being atomic (but NOT increment, decrement, compound operations, etc), seems to be correct by observation. – Thilde 14/4, 2022 at 7:19

@GabrielStaples: I wouldn't feel safe unless I saw explicit promises in the compiler manual. The nature of C/C++ is that "I've never had a problem" is not strong evidence for "it is correct". In my ARM64 example, you could go your whole life storing constants to 64-bit variables and finding them atomic, until one day someone changes 0xdeadbeefdeadbeee to 0xdeadbeefdeadbeef, and then you get to enjoy inexplicable bug reports. – Mindoro 14/4, 2022 at 7:20

Fair enough. Here is the main library manual for avr-libc, used by AVR 8-bit mcus: nongnu.org/avr-libc/user-manual/pages.html. The only "atomic" support is in <util/atomic.h> here, which provides some fancy macros with gcc extensions to provide C++-like destructor capability in C in order to disable global interrupts at the start of the ATOMIC_BLOCK(ATOMIC_RESTORESTATE), then automatically restore them at the end of the atomic block to how they were before entering the ATOMIC_BLOCK. I've studied them. – Thilde 14/4, 2022 at 7:23

Both AVR and STM32 are single core too. That probably makes a difference. – Thilde 14/4, 2022 at 7:43

What resources can you recommend to help me learn how to read the assembly you posted? – Thilde 14/4, 2022 at 7:43

Yeah, on a single-core CPU in a single-processor system, you typically only need atomicity with respect to interrupts, which is roughly equivalent to the "atomic with respect to signals" in my first bullet. Notably, the data race rules do not apply, and volatile sig_atomic_t is defined to be sufficient. Mere volatile by itself will usually suffice for what you need; compilers are more likely to promise this formally or informally. – Mindoro 14/4, 2022 at 14:39

Resources: for x86, lots of resources here. For ARM64, the formal reference is the Architecture Reference Manual; I found the Cortex-A Programmer's Guide to be friendlier as a learning text. – Mindoro 14/4, 2022 at 14:46

@GabrielStaples: Since you expressed interest in x86-64, I added an example (inspired by supercat's) in which bool fails to be atomic against modification by either threads or signal/interrupt handlers. – Mindoro 14/4, 2022 at 17:53

@NateEldredge and Lundin, I've added some important notes at the top of my question, and updates at the bottom. – Thilde 14/4, 2022 at 18:57

AFAIK x86 provides no atomicity promises about unaligned loads. - Actually, the common subset of guarantees from AMD and Intel do guarantee the atomicity of loading the middle 16 bits of an aligned dword, on P5 Pentium or newer, even from uncacheable memory. Why is integer assignment on a naturally aligned variable atomic on x86? In general unaligned word loads aren't atomic, because they could split across wider boundaries. But within a 32-bit chunk is always safe, as is within a qword in cacheable memory. – Sherr 11/6, 2022 at 18:53

The real problems from trying to use plain integer types for concurrency isn't usually atomicity per-se (on real implementations compilers tend to use single loads/stores), it's ordering or multiple accesses. e.g. assuming the value won't change so it can be read more than once, inventing loads. See Who's afraid of a big bad optimizing compiler? re: the badness that can happen if Linux kernel code wasn't careful to use READ_ONCE or WRITE_ONCE macros that cast to volatile int*, instead just using barriers to block optimization. – Sherr 11/6, 2022 at 18:57
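The volatile-cast idea mentioned in the comment above could be sketched in C++ like this. It is not the Linux kernel's actual READ_ONCE/WRITE_ONCE definition; read_once and write_once are hypothetical helpers written for this example. Note the assumption: this relies on how compilers such as GCC treat volatile accesses, and ISO C++ still considers concurrent use of a plain variable a data race.

```cpp
#include <cassert>
#include <type_traits>

// Read obj through a volatile-qualified pointer, so the compiler has to
// perform this access exactly once and cannot reuse or repeat it.
template <typename T>
T read_once(const T& obj) {
    static_assert(std::is_scalar<T>::value, "sketch covers scalar types only");
    return *static_cast<const volatile T*>(&obj);
}

// Write obj through a volatile-qualified pointer, with the same intent.
template <typename T, typename U>
void write_once(T& obj, U value) {
    static_assert(std::is_scalar<T>::value, "sketch covers scalar types only");
    *static_cast<volatile T*>(&obj) = static_cast<T>(value);
}

// Usage example: a value written with write_once is seen by read_once.
void usage_example() {
    int counter = 0;
    write_once(counter, 42);
    assert(read_once(counter) == 42);
}
```

Where std::atomic or the __atomic builtins are available, they give the same single-access behavior with a guarantee from the language or the compiler documentation instead of from convention.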

Your ARM64 example is good, though; a real example of a compiler turning an assignment into two stores, and value-dependent no less. (IDK why it doesn't use a shifted OR to create the full constant, but there's no correctness problem with what it's doing. Anything it breaks was dependent on UB. Or stp, which GCC12.1 does.) Ah, godbolt.org/z/8hqr6o3ra shows it using stp w1,w1 with a pointer arg, but with a global var it gets mixed up on generating addresses for both halves? Either way, volatile forces a single non-stp store, the semantics that the Linux kernel depends on. – Sherr 11/6, 2022 at 19:9

ARMv8.4a guarantees that ldp/stp on an aligned 128-bit location is atomic. (reviews.llvm.org/D67485). So perhaps narrower stp would be as well on ARMv8.4a. But before that, maybe not guaranteed on paper, although I wouldn't be surprised if many implementations merge the register values into a single 64-bit store. Anyway, not that it matters as the answer to this question; due to a missed optimization, some GCC versions don't stp. Also, LWN article mentions store tearing for constants as something on many RISCs; many don't have a store-pair. – Sherr 11/6, 2022 at 19:32

Reported gcc.gnu.org/bugzilla/show_bug.cgi?id=105928 (constant generation in a register, at least with -Os) / gcc.gnu.org/bugzilla/show_bug.cgi?id=105929 (ARMv8.4-a guarantees atomicity of stp w1,w1, mem for any address inside an aligned 16-byte chunk, in normal cacheable memory, allowing the stp trick even for _Atomic relaxed, and volatile.). – Sherr 11/6, 2022 at 20:23

Is unaligned access in Cortex-M4 atomic? - your unaligned 16-bit load from the middle of unsigned x could tear on ARMv7-M, and there are multi-core Cortex-M microcontrollers. And clang compiles it that way for ARM, vs. GCC loading a word for ubfx. godbolt.org/z/nb5jK91nT – Sherr 13/1 at 7:49

On 8-bit AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno or Mini), only 8-bit data types have atomic reads and writes.

Only in case you write your code in assembler, not in C.

On (32-bit) STM32 microcontrollers, any data type 32-bits or smaller is definitively automatically atomic.

Only in case you write your code in assembler, not in C. Additionally, only if the ISA guarantees that the generated instruction is atomic, I don't remember if this is true for all ARM instructions.

That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers.

No, that is definitely wrong.

Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?

The same types as in AVR and STM32: none.

This all boils down to that a variable access in C cannot be guaranteed to be atomic because it might get carried out in multiple instructions. Or in some cases in instructions for which the ISA doesn't guarantee atomicity.

The only types that can be regarded as atomic in C (and C++) are those with the _Atomic qualifier from C11 (or std::atomic<> from C++11). Period.

This answer of mine at EE here is a duplicate. It addresses the microcontroller cases explicitly, race conditions, use of volatile, dangerous optimizations etc. It also contains a simple way to protect from race conditions in interrupts which is applicable to all MCUs where interrupts cannot be interrupted. A quote from that answer:

When writing C, all communication between an ISR and the background program must be protected against race conditions. Always, every time, no exceptions. The size of the MCU data bus does not matter, because even if you do a single 8 bit copy in C, the language cannot guarantee atomicity of operations. Not unless you use the C11 feature _Atomic. If this feature isn't available, you must use some manner of semaphore or disable the interrupt during read etc. Inline assembler is another option. volatile does not guarantee atomicity.

Debt answered 14/4, 2022 at 6:46 Comment(10)

I left some comments there. There seems to be a circular contradiction you're making when you said, even if you do a single 8 bit copy in C, the language cannot guarantee atomicity of operations, because to "protect" your variables you then do an 8-bit write which must be atomic to be correct yet which you just said is not atomic. Am I missing something? – Thilde 14/4, 2022 at 6:55

@GabrielStaples I replied. The bool trick does not rely on bool being atomic, but that the access to the bool must be done first and fully evaluated before the protected code may or may not get executed. It won't matter if the bool check gets interrupted some x times before that happens. This is possible since there is no instruction re-ordering and since an interrupt cannot get interrupted in turn (unless you fiddle with the global interrupt mask from inside the ISR). – Debt 14/4, 2022 at 7:0

I see what you mean. Side note though: STM32 microcontrollers by default have nested interrupts enabled, using an NVIC (Nested Vector Interrupt Controller), so, disabling the appropriate interrupts generally makes the most sense to protect variables and also not miss data. Ex: see my usage of HAL_NVIC_DisableIRQ(USART1_IRQn); to just disable one particular interrupt, rather than globally disabling interrupts, in my answer here. Anyway, I see your bool semaphore point. Multiple ways to do things I suppose, as always. – Thilde 14/4, 2022 at 7:9

@GabrielStaples Multiple nested interrupts isn't a problem either as long as they don't access the same variables. That should only be a problem in case multiple interrupts are handled by the same ISR though. And the work-around is probably to let each such ISR access an unique variable/an unique index in some array. Just as done in multi-threading on hosted systems. – Debt 14/4, 2022 at 9:23

@Debt how do you explain that read in C is not atomic operation in ARM? I don't catch. No matter how you do it, or whatever operation you use it is always: Load register address, read from address pointed to by register. How is this not atomic? At least for variables with up to bus-size width. Variables higher than that can have shadow registers to get read atomicy for instance. – Hudibrastic 14/4, 2022 at 17:12

@tilz0R You have it backwards: how do you prove that when you write C code, your compiler always translates accesses that you feel should be atomic into instructions that actually implement the read atomically. Just read the example in the other answer. – Saito 14/4, 2022 at 17:20

@AndrewHenle Is there a way you can have an instruction where read is not atomic? If so - how please, for instance for 32-bit variable in 32-bit system? I don't know - hence the question. – Hudibrastic 14/4, 2022 at 17:21

@tilz0R There's an example in the other answer (as I write this) that shows a compiler on ARM turning a 64-bit load into two 32-bit loads - obviously that can't be atomic. There's no guarantee that won't happen with a 32-bit load if you don't explicitly tell the compiler to do an atomic load. – Saito 14/4, 2022 at 17:24

Lundin, regarding your response about my 8-bit AVR mcu claim, you said: Only in case you write your code in assembler, not in C. The AVR-libc user manual in the <util/atomic> section disagrees with you, and implies that 8-bit reads and writes are already atomic when it says,

A typical example that requires atomic access is a 16 (or more) bit variable that is shared between the main execution path and an ISR.

It is talking about C code, not assembly, as all examples it gives on that page are in C. – Thilde 14/4, 2022 at 18:41

@GabrielStaples Again, there are no guarantees that an 8 bit access from C code is atomic. The compiler might generate instructions giving atomic access, but you can't rely on that. – Debt 19/4, 2022 at 6:23
