What is meant by FENCE.TSO in the RISC-V ISA?

I don't really understand the difference between a normal FENCE in RISC-V (already answered here: What is meant by the FENCE instruction in the RISC-V instruction set?) and FENCE.TSO. The manual says:

The optional FENCE.TSO instruction is encoded as a FENCE instruction with fm=1000, predecessor=RW, and successor=RW. FENCE.TSO orders all load operations in its predecessor set before all memory operations in its successor set, and all store operations in its predecessor set before all store operations in its successor set. This leaves non-AMO store operations in the FENCE.TSO’s predecessor set unordered with non-AMO loads in its successor set.

Okay, so here is my guess. I will just show a sketch of what I understood.

There are two sets of instructions, separated by the FENCE instruction: the predecessor set and the successor set.

Load Operation 1
Load Operation 2
Load Operation 3
Store Operation 1
Store Operation 2
Store Operation 3
**FENCE.TSO**
Memory Operation 1
Memory Operation 2
Memory Operation 3
Store Operation 4
Store Operation 5
Store Operation 6

This is how I understand it. But I'm still confused by the sentence "This leaves non-AMO store operations in the FENCE.TSO’s predecessor set unordered with non-AMO loads in its successor set." What are non-AMO loads and non-AMO store operations?

Alright, AMO seems to stand for "Atomic Memory Operation". Still, I'm wondering why I can't just use the "normal" FENCE.

Tinsmith answered 22/6, 2019 at 16:50 Comment(1)
AMO = Atomic Memory Operation. Not an answer, but stores in the predecessor are only ordered against stores in the successor, so I assume a load in the successor might move to before a store in the predecessor (but after any loads). AMOs have their own ordering constraints (acquire/release/seq_cst), which might imply ordering against the predecessor's loads. - Gillan

You can use the "normal" FENCE, since it orders operations more strictly than FENCE.TSO does. This can be inferred from the note about backward compatibility with implementations that don't support the optional .TSO extension:

The FENCE.TSO encoding was added as an optional extension to the original base FENCE instruction encoding. The base definition requires that implementations ignore any set bits and treat the FENCE as global, and so this is a backwards-compatible extension.

So, what is the difference between a FENCE RW,RW and a FENCE.TSO RW,RW? Let's take a simple example.

load A
store B
<fence>
load C
store D

When <fence> is FENCE RW,RW, the following rules apply:

A < C
A < D
B < C
B < D

This results in four possible orders: ABCD, BACD, ABDC, and BADC. In other words, A/B may be reordered, and C/D may be reordered, but both A and B must be observable before both C and D.

When <fence> is FENCE.TSO RW,RW, the following rules apply:

A < C
A < D
B < D

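Note how B < C is missing; FENCE.TSO does not impose any order between predecessor stores and successor loads, so load C may be observed before store B. This is the same store-to-load relaxation that a total store ordering (TSO) memory model allows, hence the name. Presumably, this weaker ordering makes it cheaper than a "normal" FENCE.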

This gives us five possible orders: ABCD, BACD, ABDC, BADC, and ACBD. If this is acceptable to your program, you may use FENCE.TSO.
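
The two lists of orders can be checked by brute force. The following is only a sketch of the simplified model used in this answer (four operations, one global order, the fence rules written as "earlier, later" pairs); the names `OPS`, `FENCE_RW_RW`, `FENCE_TSO`, and `allowed_orders` are made up for illustration and do not come from the RISC-V manual.

```python
from itertools import permutations

# A = load, B = store (before the fence); C = load, D = store (after the fence)
OPS = "ABCD"

# Ordering rules copied from the answer, as (earlier, later) pairs.
FENCE_RW_RW = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
FENCE_TSO = [("A", "C"), ("A", "D"), ("B", "D")]  # B < C is missing


def allowed_orders(rules):
    """Return every global order of OPS that satisfies all (earlier, later) rules."""
    result = []
    for perm in permutations(OPS):
        order = "".join(perm)
        if all(order.index(x) < order.index(y) for x, y in rules):
            result.append(order)
    return result


full = allowed_orders(FENCE_RW_RW)
tso = allowed_orders(FENCE_TSO)

print(full)                           # ['ABCD', 'ABDC', 'BACD', 'BADC']
print(tso)                            # ['ABCD', 'ABDC', 'ACBD', 'BACD', 'BADC']
print(sorted(set(tso) - set(full)))   # ['ACBD']
```

The only order that FENCE.TSO adds is ACBD, where the successor load C is observed before the predecessor store B.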

Plantar answered 14/1, 2020 at 20:53 Comment(2)
Is FENCE R, RW; FENCE W, W equivalent to FENCE.TSO RW, RW? - Litchi
In my understanding, it's functionally equivalent, yes. However, it may be less performant. At the very least, it's an extra instruction that takes up room in the instruction cache and pipeline. I'm not qualified to say if execution of the second fence would be as long as the first one or if it could be streamlined in some way. If it's even possible, it would differ between implementations. - Plantar
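
The equivalence claimed in the comment above can be illustrated with a small sketch. It assumes a deliberately simplified model: a fence with predecessor set P and successor set S orders x before y whenever x is of a kind in P and y is of a kind in S, applied to the answer's load A / store B / load C / store D example. The helper `fence` and the dictionaries are hypothetical names, and the model ignores AMOs and says nothing about performance.

```python
# Operations before the fence(s): load A, store B. After: load C, store D.
BEFORE = {"A": "r", "B": "w"}
AFTER = {"C": "r", "D": "w"}


def fence(pred, succ):
    """Pairs (x, y) that a FENCE pred,succ orders as x before y (simplified model)."""
    return {
        (x, y)
        for x, kind_x in BEFORE.items() if kind_x in pred
        for y, kind_y in AFTER.items() if kind_y in succ
    }


# FENCE R,RW followed by FENCE W,W: the union of both fences' constraints.
two_fences = fence("r", "rw") | fence("w", "w")

# FENCE.TSO constraints as listed in the answer.
fence_tso = {("A", "C"), ("A", "D"), ("B", "D")}

print(two_fences == fence_tso)            # True
print(fence("rw", "rw") - two_fences)     # {('B', 'C')}
```

In this model the pair of fences yields exactly the FENCE.TSO constraints, and the only constraint a full FENCE RW,RW adds on top is B < C.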
