READ_ONCE and WRITE_ONCE in Parallel programming
In the book "Is Parallel Programming Hard, And, If So, What Can You Do About It?", the author uses several macros whose purpose I don't understand.

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

#define READ_ONCE(x) \
({ typeof(x) ___x = ACCESS_ONCE(x); ___x; })

#define WRITE_ONCE(x, val) \
do { ACCESS_ONCE(x) = (val); } while (0)

I don't understand what the ACCESS_ONCE macro does, or why it needs to cast the address of x to a pointer to volatile and then dereference it.

And what is the purpose of ___x at the end of the READ_ONCE macro?

The book also describes some uses of these macros that, again, I don't understand:

Here is a list of situations allowing plain loads and stores for some accesses to a given variable, while requiring markings (such as READ_ONCE() and WRITE_ONCE()) for other accesses to that same variable:

  1. A shared variable is only modified by a given owning CPU or thread, but is read by other CPUs or threads. All stores must use WRITE_ONCE(). The owning CPU or thread may use plain loads. Everything else must use READ_ONCE() for loads.
  2. A shared variable is only modified while holding a given lock, but is read by code not holding that lock. All stores must use WRITE_ONCE(). CPUs or threads holding the lock may use plain loads. Everything else must use READ_ONCE() for loads.
  3. A shared variable is only modified while holding a given lock by a given owning CPU or thread, but is read by other CPUs or threads or by code not holding that lock. All stores must use WRITE_ONCE(). The owning CPU or thread may use plain loads, as may any CPU or thread holding the lock. Everything else must use READ_ONCE() for loads.
  4. A shared variable is only accessed by a given CPU or thread and by a signal or interrupt handler running in that CPU’s or thread’s context. The handler can use plain loads and stores, as can any code that has prevented the handler from being invoked, that is, code that has blocked signals and/or interrupts. All other code must use READ_ONCE() and WRITE_ONCE().
  5. A shared variable is only accessed by a given CPU or thread and by a signal or interrupt handler running in that CPU’s or thread’s context, and the handler always restores the values of any variables that it has written before return. The handler can use plain loads and stores, as can any code that has prevented the handler from being invoked, that is, code that has blocked signals and/or interrupts. All other code can use plain loads, but must use WRITE_ONCE() to prevent store tearing, store fusing, and invented stores.

First of all, how can we use these macros for simultaneous access to memory? AFAIK the volatile keyword is not safe for concurrent memory access.

In item 1, how can READ_ONCE and WRITE_ONCE be used to access a shared variable without a data race?

And in item 2, why does he use the WRITE_ONCE macro when writes are only allowed while holding the lock? And why don't reads need to hold the lock?

Algo answered 5/9, 2021 at 20:0 Comment(17)
Wow those macros are awful. How old is the book? The use of volatile does nothing to prevent race conditions and Undefined Behaviour when sharing data across threads. You should probably be using the standard <atomic> library instead.Bridewell
The book is recent and has been updated as of 2021. mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/…Algo
Wow. Volatile has absolutely nothing to do with multithreading. Before C++ standardized threading (in C++11, over a decade ago!), people used to attempt to use volatile for this because we didn't have anything better. For the last decade we've known better and use standard mechanisms.Oliana
Also: c.isvolatileusefulwiththreads.comOliana
@MikeVine I know volatile is useless, but he uses these macros for concurrent access to memory without any locking mechanism, and that is exactly what I can't understand.Algo
volatile is neither necessary nor sufficient to do anything with multithreading nor concurrent access. It exists to interact with hardware - like writing to external registers - and unless the book is using it for this specific mechanism (which seems unlikely) the book is wrong.Oliana
For example, point 2 makes no sense. In C++ it is undefined behavior to do exactly what they are saying. Reading a (non std::atomic) variable outside of a lock whilst another thread is writing it is broken, no matter how many useless macros are used.Oliana
I don't think the book is wrong, because the macros are defined in the Linux kernel.Algo
@Algo they are wrong in context. If you have an environment which can guarantee those macros work then they can work, and presumably the Linux kernel is one such environment. I can make up any rules for something which works and create an environment in which they do. That doesn't change the fact that in standard C++ they're wrong. Edit: Apologies, I didn't see that linux-kernel was tagged. In which case you need to look up the multithreading requirements of that environment.Oliana
READ_ONCE and WRITE_ONCE are usable with the C89 standard, where atomics don't exist even as a compiler extension. These macros provide facilities similar to those provided by memory_order_relaxed loads and stores in C11. Their implementation via volatile is correct for the gcc compiler, which is the only one supported for the Linux kernel.Echino
@Echino atomicity is not guaranteed by these macros, especially for composite types bigger than a cache line (at least on x86-64) and probably even for types bigger than the biggest native type size (typically from 4 up to 32 bytes on most common architectures). So using them in portable, architecture-independent code with unspecified type limitations looks like a bad idea.Gemmagemmate
___x is reserved for use by the implementation. It has no business being in that code.Unmannerly
This book is about C programming, not C++ - so do not expect to get good C++ practices from there. The code you provided is from Atomic Operations (GCC Classic) chapter - basically low-level workarounds for not provided builtin functionality (the way they implement them in Linux kernel), and the next chapter Atomic Operations (C11) explains there are builtins provided by modern compiler.Civilian
For C "...Note that volatile variables are not suitable for communication between threads; they do not offer atomicity, synchronization, or memory ordering. A read from a volatile variable that is modified by another thread without synchronization or concurrent modification from two unsynchronized threads is undefined behavior due to a data race...." en.cppreference.com/w/c/language/volatileBelding
For C++ "...This makes volatile objects suitable for communication with a signal handler, but not with another thread of execution, see std::memory_order). ...." en.cppreference.com/w/cpp/language/cvBelding
@PeteBecker They probably chose to use ___x so that it wouldn't clash with the parameter. Also, the Linux kernel from which those macros are taken does not use the standard library, and those macros are part of the kernel's "library".Crocket
@IanAbbott -- if this question is about one-off code with a one-off compiler, that's all the more reason to ignore it. There's nothing useful to be gained by discussing it here.Unmannerly
D
9

These macros are a way to enforce some level of atomicity (but no synchronization) on supporting compilers (GCC, maybe some others). They are used heavily inside Linux, as it predates C11 by a huge margin.

In GCC semantics, volatile results in emitting exactly one instruction accessing the pointed-to value (at least if that value is word-sized). On all architectures Linux supports, aligned word-sized accesses are atomic, so the overall construction results in one single atomic access. (I mean the machine word, of course, not the WORD type on some well-known platform.)
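
To make the "exactly one access" point concrete, here is a minimal sketch of a polling loop. It is my own illustration, not code from the book or the kernel: stop_requested, request_stop() and wait_for_stop() are hypothetical names, and I spell typeof as __typeof__ only so that the macros also build in GCC's strict ISO modes.

```c
/* Same macros as in the question, relying on GCC/Clang extensions
 * (__typeof__ and statement expressions). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define READ_ONCE(x) ({ __typeof__(x) ___x = ACCESS_ONCE(x); ___x; })
#define WRITE_ONCE(x, val) do { ACCESS_ONCE(x) = (val); } while (0)

/* Hypothetical flag: set by another thread or by a signal handler. */
static int stop_requested;

/* Writer side: the volatile access makes the compiler emit the store
 * as written instead of tearing, fusing, or dropping it. */
static void request_stop(void)
{
    WRITE_ONCE(stop_requested, 1);
}

/* Reader side: polls until the flag is set or max_polls is reached and
 * returns how many polls saw the flag still clear.
 * With a plain `!stop_requested` the compiler would be allowed to load
 * the flag once and then test a register copy forever; READ_ONCE
 * forces a fresh load on every iteration. */
static unsigned long wait_for_stop(unsigned long max_polls)
{
    unsigned long polls = 0;

    while (polls < max_polls && !READ_ONCE(stop_requested))
        polls++;
    return polls;
}
```

Note that this only constrains the compiler. It gives no ordering guarantee relative to other variables, which is why I said "no synchronization" above.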

To my knowledge, that is equivalent to using C++ atomics with memory_order_relaxed (as was pointed out in the comments), with the exception that those require every access to be atomic, while that's not actually required in some patterns (for example, if only one thread writes to a variable, its own reads don't need to be atomic, and atomic read-modify-write operations are definitely unnecessary in that case).
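
As a rough side-by-side, here is a sketch of the single-writer pattern (item 1 in the question) written both ways. It assumes an aligned int on an architecture where such accesses are atomic. The names hits, hits_c11 and the helper functions are made up for illustration, and I use C11 <stdatomic.h> rather than C++ so that both halves are in the same language.

```c
#include <stdatomic.h>

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define READ_ONCE(x) ({ __typeof__(x) ___x = ACCESS_ONCE(x); ___x; })
#define WRITE_ONCE(x, val) do { ACCESS_ONCE(x) = (val); } while (0)

static int hits;            /* kernel style: plain int, marked accesses */
static atomic_int hits_c11; /* C11 style: every access is atomic */

/* Owner thread only. The load of `hits` on the right-hand side is a
 * plain load: nobody else ever writes it, so it cannot change under
 * the owner. Only the store needs to be marked. */
static void owner_hit(void)
{
    WRITE_ONCE(hits, hits + 1);
}

/* Any other thread: the load must be marked. */
static int reader_hits(void)
{
    return READ_ONCE(hits);
}

/* C11 counterpart: the owner's own load has to be an atomic load too,
 * but a relaxed load plus a relaxed store is enough. No atomic
 * read-modify-write is needed because there is a single writer. */
static void owner_hit_c11(void)
{
    int v = atomic_load_explicit(&hits_c11, memory_order_relaxed);

    atomic_store_explicit(&hits_c11, v + 1, memory_order_relaxed);
}

static int reader_hits_c11(void)
{
    return atomic_load_explicit(&hits_c11, memory_order_relaxed);
}
```

On mainstream hardware I would expect both versions to compile to ordinary load and store instructions. The difference is in what the language standard promises, not in the generated code.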

Dogs answered 5/9, 2021 at 22:15 Comment(1)
C++20 finally added the ability to do an atomic access to a plain variable, with std::atomic_ref. Also, see When to use volatile with multi threading? for why volatile works in practice pretty much like relaxed atomic accesses on real hardware (because coherent caches are universally used), despite ISO C still considering it data-race UB. One difference is that volatile stops compile-time reordering even for accesses to different variables.Chile
