ARMv8.3 meaning of rcpc

With ARMv8.3 a new instruction has been introduced: LDAPR.

When an STLR is followed by an LDAR to a different address, the two can't be reordered; hence this is called RCsc (release consistent, sequentially consistent).

When an STLR is followed by an LDAPR to a different address, the two can be reordered. This is called RCpc (release consistent, processor consistent).
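The difference is easiest to see with the store-buffering litmus test. Here is a minimal C++ sketch, assuming a compiler that maps seq_cst stores/loads to STLR/LDAR and, when the RCpc extension is enabled for the target, acquire loads to LDAPR (that mapping is my assumption about typical AArch64 code generation, and `count_both_zero` is just a name made up for this example). With seq_cst on both sides, the outcome where both threads read 0 is forbidden; with release stores and acquire loads it is allowed, although whether you actually observe it depends on the hardware and timing.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test: each thread stores to one variable and then
// loads the other. Returns how often both loads observed the initial 0,
// i.e. how often each store was effectively reordered after the later load.
template <std::memory_order StoreOrder, std::memory_order LoadOrder>
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] {
            x.store(1, StoreOrder);   // STLR for release or seq_cst (assumed)
            r1 = y.load(LoadOrder);   // LDAR for seq_cst, possibly LDAPR for acquire
        });
        std::thread t2([&] {
            y.store(1, StoreOrder);
            r2 = x.load(LoadOrder);
        });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling `count_both_zero<std::memory_order_seq_cst, std::memory_order_seq_cst>(n)` must return 0 under the C++ memory model. Calling it with `std::memory_order_release` and `std::memory_order_acquire` may return a nonzero count, which is the StoreLoad reordering that RCpc permits.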

My issue is with the PC part.

PC is a relaxation of TSO: TSO is multi-copy atomic, whereas PC is non-multi-copy atomic.

The ARMv8 memory model has been revised to be multi-copy atomic, because no vendor ever built a non-multi-copy-atomic microarchitecture and allowing for one made the memory model more complicated.

So I'm running into a contradiction.

The key question is: is every store (including relaxed) multi-copy atomic?

If so, then the PC part of rcpc doesn't make sense to me since PC is non multi-copy atomic. Could it be a legacy name due to ARM being non multi-copy atomic in the past?

There are multiple definitions of PC, so perhaps that is the cause.

In practice, STLR / LDAPR gives the ordering of C++ std::memory_order_acq_rel (release stores, acquire loads), as opposed to SC.

So "processor consistent" presumably describes the fact that the current core sees its own operations in program order, and serves as a way to note that it's not sequentially consistent, because they don't use that term. It doesn't mean that other parts of the memory model rules are removed.

AFAIK, yes, ARMv8 is multi-copy atomic, so every plain store (str, stp, etc.) is multi-copy atomic, i.e. it becomes visible to all other cores at the same time via coherent cache, so all threads can agree on the order of two stores done by two independent writers (the IRIW litmus test). This is unlike POWER, where some threads can see stores early from other SMT threads on the same physical core.
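For reference, the IRIW (independent reads of independent writes) test mentioned above can be sketched in C++ like this. It is only an illustration: `count_iriw_disagreements` is a made-up name, and the thread layout is my own assumption about how to write the test. The C++ abstract machine forbids the "readers disagree" outcome only when every access is seq_cst. With release stores and acquire loads the language allows it, but a multi-copy-atomic machine should never produce it, which is the hardware guarantee being discussed.

```cpp
#include <atomic>
#include <thread>

// IRIW litmus test: two independent writers, two readers that read the
// two variables in opposite orders. Returns how often the readers
// disagreed about which store happened first.
template <std::memory_order StoreOrder, std::memory_order LoadOrder>
int count_iriw_disagreements(int iterations) {
    int disagreements = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int a = 0, b = 0, c = 0, d = 0;
        std::thread w1([&] { x.store(1, StoreOrder); });
        std::thread w2([&] { y.store(1, StoreOrder); });
        std::thread r1([&] { a = x.load(LoadOrder); b = y.load(LoadOrder); });
        std::thread r2([&] { c = y.load(LoadOrder); d = x.load(LoadOrder); });
        w1.join();
        w2.join();
        r1.join();
        r2.join();
        // r1 saw x's store but not y's; r2 saw y's store but not x's.
        if (a == 1 && b == 0 && c == 1 && d == 0) ++disagreements;
    }
    return disagreements;
}
```

With `std::memory_order_seq_cst` for both template arguments the function must return 0 on any conforming implementation. With `std::memory_order_release` / `std::memory_order_acquire`, a nonzero result would be legal C++ but would indicate hardware that is not multi-copy atomic.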

I don't think LDAPR relaxed that guarantee.

(ARMv7 did not have this property on paper, but all real-world implementations did. So ARM was able to strengthen its guarantees without actually changing how any real ARM microarchitecture worked, beyond adding support for the new instructions of ARMv8's 32-bit mode. "Shared Memory Consistency Models: A Tutorial" from 1995, linked in comments, uses the term RCpc to describe a category of memory models that does allow some readers to see some stores before other readers do, permitting IRIW reordering. So it seems either ARMv8 is using a different meaning, or other requirements still come into play to forbid IRIW reordering.)

Big caveat: I'm not a terminology expert on this, and I've never heard of "processor consistent" before so I'm just guessing from context what they mean by it, with an interpretation that would be consistent with all known facts. Please correct me if this is incompatible with an accepted definition of the term.
