C11 memory model -- two threads execute atomic_fetch_add followed by atomic_load -- what output is possible?
Asked Answered
F

1

4

Consider the following test program, compiled and run on an implementation that fully implements C2011 atomics and threads.

#include <stdio.h>
#include <stdatomic.h>
#include <threads.h>

#if !ATOMIC_INT_LOCK_FREE
#error "Program behavior is only interesting if atomic_int is lock-free."
#endif

static atomic_int var;

int inc_print(void *unused)
{
   atomic_fetch_add(&var, 1);
   printf(" %d", atomic_load(&var));
   return 0;
}

int main(void)
{
    thrd_t thrd;
    if (thrd_create(&thrd, inc_print, 0) == thrd_success) {
        inc_print(0);
        thrd_join(&thrd, 0);
    }
    putchar('\n');
    return 0;
}

I have managed to convince myself that all of the following statements are true:

  1. Each thread's atomic_load must observe the increment performed by that same thread, so it cannot read a zero.
  2. Each thread's atomic_load may or may not observe the increment performed by the other thread. (The other thread might not get scheduled at all until after the atomic_load.) Therefore, it can read either a 1 or a 2.
  3. The calls to printf are serialized only against each other. Therefore, if one thread's atomic_load reads a 1 and the other thread's atomic_load reads a 2, either 1 2 or 2 1 may be printed.
  4. It is possible for both atomic_loads to observe the increment performed by the other thread, so the output 2 2 is also possible.

What I'm not sure of, though: Is it possible for neither of the atomic_loads to observe the increment performed by the other thread? That is, is the output 1 1 possible?

Also, does relaxing the memory model change anything?

Fariss answered 2/8, 2022 at 19:6 Comment(0)
E
3

Your conclusions look correct to me.

The default memory_order_seq_cst guarantees this whole program executes in a sequentially consistent manner, since it's data-race free and doesn't use any non-SC atomics. So the possible results are only interleavings of program-order.

This allows both increments then both loads, but one increment must come after the other, seeing its 1 result and writing a 2. And the load in that thread must come after it, so at least one thread sees a 2. The 1 1 result is impossible, the 2 2 result can happen.


Relaxed atomics don't introduce any new possibilities here; we can obtain the same guarantees from the rules for operations on the same atomic variable that apply regardless of the memory_order parameter.

  • A consistent modification order for var exists, and a total of two atomic increments must do a total of +=2. (Atomic RMWs are guaranteed to read the latest value for this reason, to make sure RMWs on the same object are serialized with each other, not both loading a 0 and writing a 1. That wouldn't be an atomic increment.)
  • A load after a fetch_add in the same thread must see its result or some later value in the modification order. (In C++ this is the write-read coherence guarantee, and sequenced-before (in the same thread) forcing ordering between two operations on the same atomic object. edits welcome with a link to the equivalent language in the C11 standard.)

And yes, the printfs are ordered independently of the atomic modifications to var. It effectively locks stdout. If that works like the rules for C11 mtx_lock, that's like an acquire operation on the mutex, so it can be starting to take the lock before the increment or load complete.

Not that that's relevant; it's not required for the 2 1 output to be possible. Even if locking was seq_cst, you could get to a state where neither printf had started, but all the atomics had finished. With one thread having a 1 and the other a 2 as their temporaries. Then it's just chance which gets to print first.

Eeg answered 2/8, 2022 at 20:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.