That's not really re-entrancy; you're not running a function twice in the same thread (or in different threads). You can get that via recursion or passing the address of the current function as a callback function-pointer arg to another function. (And it wouldn't be unsafe because it would be synchronous).
This is just plain vanilla data-race UB (Undefined Behaviour) between a signal handler and the main thread: only `sig_atomic_t` is guaranteed safe for this. Others may happen to work, like in your case where an 8-byte object can be loaded or stored with one instruction on x86-64, and the compiler happens to choose that asm. (As @icarus's answer shows.)
See MCU programming - C++ O2 optimization breaks while loop - an interrupt handler on a single-core microcontroller is basically the same thing as a signal handler in a single threaded program. In that case the result of the UB is that a load got hoisted out of a loop.
Your test-case of tearing actually happening because of data-race UB was probably developed / tested in 32-bit mode, or with an older dumber compiler that loaded the struct members separately.
In your case, the compiler can optimize the stores out of the infinite loop because no UB-free program could ever observe them. `data` is not `_Atomic` or `volatile`, and there are no other side-effects in the loop.
So there's no way any reader could synchronize with this writer. This in fact happens if you compile with optimization enabled (Godbolt shows an empty loop at the bottom of main). I also changed the struct to two `long long`, and gcc uses a single `movdqa` 16-byte store before the loop. (This is not guaranteed atomic, but in practice it is on almost all CPUs, assuming it's aligned, or on Intel if it merely doesn't cross a cache-line boundary. See Why is integer assignment on a naturally aligned variable atomic on x86?)
So compiling with optimization enabled would also break your test, and show you the same value every time. C is not a portable assembly language.
`volatile struct two_int` would also force the compiler not to optimize the stores away, but would not force it to load/store the whole struct atomically. (It wouldn't stop it from doing so either, though.) Note that `volatile` does not avoid data-race UB, but in practice it's sufficient for inter-thread communication on normal CPU architectures, and was how people built hand-rolled atomics (along with inline asm) before C11 / C++11. Those architectures are cache-coherent, so `volatile` is in practice mostly similar to `_Atomic` with `memory_order_relaxed` for pure-load and pure-store, if used for types narrow enough that the compiler uses a single instruction so you don't get tearing. And of course `volatile` doesn't have any guarantees from the ISO C standard vs. writing code that compiles to the same asm using `_Atomic` and `mo_relaxed`.
If you had a function that did `global_var++;` on an `int` or `long long` that you run from main and asynchronously from a signal handler, that would be a way to use re-entrancy to create data-race UB. Depending on how it compiled (to a memory-destination `inc` or `add`, or to separate load/inc/store), it would or wouldn't be atomic with respect to signal handlers in the same thread. See Can num++ be atomic for 'int num'? for more about atomicity on x86 and in C++. (C11's `stdatomic.h` and the `_Atomic` keyword provide equivalent functionality to C++11's `std::atomic<T>` template.)
An interrupt or other exception can't happen in the middle of an instruction, so a memory-destination `add` is atomic wrt. context switches on a single-core CPU. Only a (cache-coherent) DMA writer could "step on" an increment from an `add [mem], 1` without a `lock` prefix on a single-core CPU; there aren't any other cores that another thread could be running on.
So it's similar to the case of signals: a signal handler runs instead of the normal execution of the thread handling the signal, so it can't be handled in the middle of one instruction.