They don't in themselves do anything to solve concurrency, but they do stop the compiler from doing silly things like loading a value from the same memory location twice. This is important if, for example, you are accessing hardware and don't want to trigger multiple bus accesses, which could affect future reads and writes.
Compilers will do this sort of thing because they are generally allowed to optimise accesses to aliased variables on the assumption that they know how the whole system behaves.
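To make the point about repeated loads concrete, here is a minimal sketch in C. The names fake_status_reg, read_status_twice and read_plain_twice are made up for illustration; on real hardware the register would be a fixed memory-mapped address rather than an ordinary variable.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped status register (assumption:
   real code would use a fixed device address instead of a variable). */
volatile uint32_t fake_status_reg;

/* Two volatile reads in the source produce two loads in the output:
   the compiler may not merge, drop, or duplicate volatile accesses. */
uint32_t read_status_twice(void)
{
    uint32_t first  = fake_status_reg;  /* first load */
    uint32_t second = fake_status_reg;  /* second, separate load */
    return first ^ second;              /* nonzero if the value changed in between */
}

/* Through a non-volatile pointer the compiler is free to load once and
   reuse the value, so an optimizer may reduce this function to "return 0". */
uint32_t read_plain_twice(const uint32_t *p)
{
    uint32_t first  = *p;
    uint32_t second = *p;   /* may be folded into the first load */
    return first ^ second;
}
```

Comparing the generated asm for the two functions at -O2 is a quick way to see the difference. Note that none of this orders the accesses with respect to other threads.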
To truly support concurrency you need to reason about memory consistency and about which values are guaranteed to be visible to one thread if another value is visible. This includes declaring operations to be atomic (to avoid "tears", where a value is read in smaller parts and the parts are combined) and specifying memory barriers. Memory barriers allow you to ensure that values protected by another field will be visible to the other thread when accessed.
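As a rough illustration of "values protected by another field", the usual pattern is a release store on a flag paired with an acquire load. This is only a sketch using C11 atomics; payload, ready, publish and try_consume are hypothetical names, not anything from the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary data, protected by `ready` */
static atomic_bool ready;  /* guard flag; static storage, so it starts out false */

/* Writer side: fill in the data first, then publish it with a release store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: acquire load of the flag. If it observes true, the write to
   payload that preceded the release store is guaranteed to be visible. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;      /* not published yet */
    *out = payload;
    return true;
}
```

One thread would call publish() once, and another would poll try_consume() until it returns true. Because the flag is atomic it cannot tear, and the release/acquire pair is what makes the payload visible.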
A volatile read or write alone is similar to a C _Atomic load or store with memory_order_relaxed, and usually compiles to the same asm. See When to use volatile with multi threading? (never, except in the Linux kernel) for some of the low-level details of why this is true.
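A small side-by-side sketch of the two forms, in case it helps. The variable and function names are invented for the example. Strictly speaking, the C standard only gives the _Atomic version defined behaviour under concurrent access, even though the volatile one usually compiles to the same instructions for a naturally aligned int.

```c
#include <stdatomic.h>

volatile int v_counter;  /* accessed through volatile */
_Atomic int  a_counter;  /* accessed through C11 atomics */

/* Plain volatile accesses: one load or store each, no ordering guarantees. */
int  load_volatile(void)   { return v_counter; }
void store_volatile(int x) { v_counter = x; }

/* Relaxed atomic accesses: atomic (no tearing) but also no ordering. */
int load_relaxed(void)
{
    return atomic_load_explicit(&a_counter, memory_order_relaxed);
}

void store_relaxed(int x)
{
    atomic_store_explicit(&a_counter, x, memory_order_relaxed);
}
```

Compiling both pairs with optimisation on and diffing the asm is an easy way to check the "usually the same" claim on your own target.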
On ISAs like ARMv8 that have load-acquire and store-release instructions, you'd prefer to use those (via smp_load_acquire / smp_store_release) instead of a volatile READ_ONCE and a separate barrier. Most older ISAs just had plain loads and separate barrier instructions. The kernel's READ_ONCE / WRITE_ONCE model is designed around that.
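The two styles can be approximated in user space as below. This is a sketch under assumptions: MY_READ_ONCE and MY_WRITE_ONCE are hypothetical stand-ins built from a volatile cast (they are not the kernel's actual definitions), __typeof__ is a GCC/Clang extension, and a C11 acquire fence plays the role of the separate barrier.

```c
#include <stdatomic.h>

/* Hypothetical user-space stand-ins, NOT the kernel's macros. */
#define MY_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

int         flag_plain;   /* read via volatile cast plus a fence */
_Atomic int flag_atomic;  /* read via a single load-acquire */

/* Older style: a plain (volatile) load followed by a separate barrier. */
int load_then_barrier(void)
{
    int f = MY_READ_ONCE(flag_plain);
    atomic_thread_fence(memory_order_acquire);  /* the separate barrier */
    return f;
}

/* Load-acquire style: the ordering is part of the load itself. */
int load_acquire(void)
{
    return atomic_load_explicit(&flag_atomic, memory_order_acquire);
}
```

On AArch64, compilers typically turn the first function into a plain load plus a barrier instruction and the second into a single load-acquire instruction, which is the difference the paragraph above describes.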
p and of the result. This is completely unrelated to the other part of the question. (Prefer to ask a single question in the question post). – Apolitical
READ_ONCE uses smp_read_barrier_depends(), which is non-empty only on Alpha. The implementation of WRITE_ONCE doesn't use CPU barriers at all. – Apolitical
__read_once_size and __write_once_size actually use memcpy as a fallback, but that memcpy is wrapped with barrier() calls. Those calls provide only a compiler barrier and have nothing in common with the processor's cache coherency. – Apolitical
volatile is about equivalent to atomic_load_explicit(&var, memory_order_relaxed). See my answer on When to use volatile with multi threading? – Caterina