If I have atomic_bool flag;, how can I write C code to toggle it that's atomic, portable, and efficient? Regarding "efficient", I'd like it to assemble on x86_64 to lock xorb $1, flag(%rip). The "obvious" flag = !flag; is out because it isn't actually atomic. My next guess would be flag ^= true;, which assembled to this mess on GCC:
movzbl flag(%rip), %eax
0:
movb %al, -1(%rsp)
xorl $1, %eax
movl %eax, %edx
movzbl -1(%rsp), %eax
lock cmpxchgb %dl, flag(%rip)
jne 0b
And this mess on Clang:
movb flag(%rip), %al
0:
andb $1, %al
movl %eax, %ecx
xorb $1, %cl
lock cmpxchgb %cl, flag(%rip)
jne 0b
Then I tried specifying a weaker memory order by doing atomic_fetch_xor_explicit(&flag, true, memory_order_acq_rel); instead. This does what I want on Clang, but GCC now completely fails to compile it with error: operand type '_Atomic atomic_bool *' {aka '_Atomic _Bool *'} is incompatible with argument 1 of '__atomic_fetch_xor'. Interestingly, if my type is an atomic_char instead of an atomic_bool, then both GCC and Clang emit the assembly that I want. Is there a way to do what I want with atomic_bool?
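For reference, the atomic_char variant described above can be sketched roughly as follows. This is only an illustration of that observation, not a solution for atomic_bool itself. The names char_flag, toggle_char_flag and read_char_flag are hypothetical and do not come from the question.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: the flag is stored as atomic_char, not atomic_bool. */
static atomic_char char_flag;

/* Toggle bit 0 atomically. The fetched old value is deliberately
 * discarded, which is the case where a plain locked XOR suffices. */
void toggle_char_flag(void)
{
    (void)atomic_fetch_xor_explicit(&char_flag, 1, memory_order_acq_rel);
}

/* Read the flag back as a bool. */
bool read_char_flag(void)
{
    return atomic_load(&char_flag) != 0;
}
```

Because the result of the fetch is unused here, the compiler is free to avoid a compare-exchange loop; whether it actually does so depends on the compiler and version.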
atomic_bool. (I wonder if the "None of these operations is applicable to atomic_bool" part from the standard suggests that compilers should refuse to compile such code, making clang non-conformant here.) – Magenta
atomic_char flag;? – Ebsen
typedef unsigned char mybool_t; and then you could use it with the gcc intrinsics. It's not ideal - having to create another "bool" type - but it could be a good enough workaround. (At least, that's what I did when I had this problem in the past :) – Vanhorn
_Bool isn't any safer than using char when it comes to "someone setting it to a value other than 0 or 1", just use atomic_char, and if it is a global, then provide set/get functions that accept a bool and write to a char. If you really think that it is a hack, remember that the C bool type is just a hacked-in typedef for _Bool, whatever that is, which is a hack that is here simply to keep backward compatibility with code that may have used "bool" with the assumption that it isn't a keyword, and there's no point in worrying about such things anyway. – Alphonsoalphonsus
_Bool flag = 2; won't actually store a 2, but char flag = 2; will, so I disagree that _Bool isn't any safer. – Preraphaelite
2 when there's no valid boolean value that can be mapped from 2 is like worrying about someone passing a NULL to an API that clearly states it is undefined behaviour to do so. This is just how it is in C and it is too late to change that now. – Alphonsoalphonsus
bool foo = return_zero_or_nonzero(); is an error, and you should have written bool foo = (f() != 0); to explicitly booleanize, rather than rely on implicit conversion to bool? That's one style choice, but with unsigned char the compiler isn't going to warn you if you get it wrong. – Kratzer
flag ^= 1;, that's a missed optimization on their part, and should get reported (github.com/llvm/llvm-project/issues and gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc). If the return value is unused, yes, lock xorb is optimal. And if it is used, lock btc $0, flag(%rip). – Kratzer
memory_order_seq_cst (which is what the standard requires for any operation where you don't specify an explicit one, even though I don't need it)? – Preraphaelite
lock xorb is a full barrier and more than sufficient for a seq_cst RMW, just like lock addl is safe for atomic_fetch_add. (Or lock xaddl if the return value is used.) x86 can't do atomic RMWs with anything less than a full memory barrier, so atomic_fetch_add_explicit for a relaxed integer add only allows compile-time reordering, still lock add or lock xadd in the asm, same as seq_cst. See The strong-ness of x86 store instruction wrt. SC-DRF? re: xchg or other locked insn being as strong as a full SC fence. – Kratzer
atomic_fetch_xor and friends on atomic_bool, whereas AFAICT it does require supporting ^=. So it really doesn't save anything for the implementation, except that it deprives the programmer of their choice of memory ordering. – Bender
flag ^= 1 (or equivalents like flag -= 1), and gcc optimizes them poorly. You can get the better-optimized version, at cost of portability, with atomic_fetch_xor((atomic_uchar *)&flag, 1). Wrap in ifdef as needed. Do you want that in an answer, or are you holding out for a new idea out of left field? – Bender
^= compiles to, as @PeterCordes mentioned.) – Preraphaelite
lock xor a bitfield when different threads xor different bits. But if you let N threads atomically xor the same boolean flag, won't the result just be N % 2? If that's the case, wouldn't it be easier to use atomic_add and extract the LSb when needed? If the threads compete to set/reset the flag, wouldn't you need a form of synchronization to make sure that two threads trying to set the flag won't end up setting and resetting it? In that case you wouldn't need an atomic. – Brachiate
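Several of the comments above suggest storing the flag in an unsigned char and exposing set/get functions that take a bool. A minimal sketch of that idea follows, assuming atomic_uchar storage. All names here (flag_storage, flag_set, flag_get, flag_toggle, flag_after_toggles) are hypothetical. The last helper runs single-threaded and only illustrates the observation that N toggles from a cleared flag leave it at N % 2.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wrapper: storage is atomic_uchar, the interface is bool. */
static atomic_uchar flag_storage;

/* The bool parameter converts any nonzero argument to 1 before the
 * store, so a stray value such as 2 never reaches the storage. */
void flag_set(bool value)
{
    atomic_store(&flag_storage, (unsigned char)value);
}

bool flag_get(void)
{
    return atomic_load(&flag_storage) != 0;
}

/* Toggle bit 0; the fetched old value is deliberately discarded. */
void flag_toggle(void)
{
    (void)atomic_fetch_xor(&flag_storage, 1);
}

/* Single-threaded illustration: toggle n times starting from a
 * cleared flag and return the final state, which equals n % 2. */
bool flag_after_toggles(unsigned n)
{
    flag_set(false);
    for (unsigned i = 0; i < n; i++)
        flag_toggle();
    return flag_get();
}
```

Callers only ever see bool, so the normalization that _Bool provides happens at the function boundary rather than in the stored type. This trades the atomic_bool type for a small amount of wrapper code and is a design choice, not the only option.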