VarHandle get/setOpaque

I keep fighting to understand what VarHandle::setOpaque and VarHandle::getOpaque are really doing. It has not been easy so far - there are some things I think I get (but will not present them in the question itself, not to muddy the waters), but overall this is misleading at best for me.

The documentation:

Returns the value of a variable, accessed in program order...

Well in my understanding if I have:

int xx = x; // read x
int yy = y; // read y

These reads can be re-ordered. On the other hand if I have:

// simplified code, does not compile, but reads happen on the same "this" for example
int xx = VarHandle_X.getOpaque(x); 
int yy = VarHandle_Y.getOpaque(y);

This time re-orderings are not possible? And this is what it means "program order"? Are we talking about insertions of barriers here for this re-ordering to be prohibited? If so, since these are two loads, would the same be achieved via:

int xx = x;
VarHandle.loadLoadFence();
int yy = y;

But it gets a lot trickier:

... but with no assurance of memory ordering effects with respect to other threads.

I could not come up with an example to even pretend I understand this part.

It seems to me that this documentation is targeted at people who know exactly what they are doing (and I am definitely not one)... So can someone shed some light here?

Wheels answered 28/5, 2019 at 12:19 Comment(0)

Well in my understanding if I have:

int xx = x; // read x
int yy = y; // read y

These reads can be re-ordered.

These reads may not only be reordered; they may not happen at all. The thread may use an old, previously read value for x and/or y, or values it previously wrote to these variables even though the write itself may not have been performed yet. So the “reading thread” may use values that no other thread knows of and that are not in heap memory at that time (and probably never will be).

On the other hand if I have:

// simplified code, does not compile, but reads happen on the same "this" for example
int xx = VarHandle_X.getOpaque(x); 
int yy = VarHandle_Y.getOpaque(y);

This time re-orderings are not possible? And this is what it means "program order"?

Simply put, the main feature of opaque reads and writes is that they will actually happen. This implies that they cannot be reordered with respect to other memory accesses of at least the same strength, but that has no impact on ordinary reads and writes.

The term program order is defined by the JLS:

… the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.

That’s the evaluation order specified for expressions and statements: the order in which we perceive the effects, as long as only a single thread is involved.

Are we talking about insertions of barriers here for this re-ordering to be prohibited?

No, there is no barrier involved, which might be the intention behind the phrase “…but with no assurance of memory ordering effects with respect to other threads”.

Perhaps we could say that opaque access works a bit like volatile did before Java 5: it forces a read to see the most recent heap memory value (which only makes sense if the writing end also uses opaque or an even stronger mode), but has no effect on other reads or writes.

So what can you do with it?

A typical use case would be a cancellation or interruption flag that is not supposed to establish a happens-before relationship. Often, the stopped background task has no interest in perceiving actions made by the stopping task prior to signalling, but will just end its own activity. So writing and reading the flag with opaque mode would be sufficient to ensure that the signal is eventually noticed (unlike the normal access mode), but without any additional negative impact on the performance.
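A minimal sketch of such a cancellation flag, in case it helps. The class and method names (OpaqueCancelFlag, cancel, runUntilCancelled) are made up for illustration; only the VarHandle calls are real API. Note that nothing here establishes a happens-before relationship: the worker is only assumed to notice the flag eventually.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaqueCancelFlag {
    // Plain field; every cross-thread access goes through the VarHandle below.
    private boolean cancelled;

    private static final VarHandle CANCELLED;
    static {
        try {
            CANCELLED = MethodHandles.lookup()
                    .findVarHandle(OpaqueCancelFlag.class, "cancelled", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Called by the stopping thread. Opaque write: it will actually be performed. */
    public void cancel() {
        CANCELLED.setOpaque(this, true);
    }

    /** Opaque read: cannot be hoisted out of a loop like a plain read could. */
    public boolean isCancelled() {
        return (boolean) CANCELLED.getOpaque(this);
    }

    /** Hypothetical background task: spins until cancelled, returns the iteration count. */
    public long runUntilCancelled() {
        long iterations = 0;
        while (!isCancelled()) {
            iterations++;
            Thread.onSpinWait();
        }
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        OpaqueCancelFlag task = new OpaqueCancelFlag();
        Thread worker = new Thread(task::runUntilCancelled);
        worker.start();
        Thread.sleep(50);
        task.cancel();   // signal only; no data is published along with the flag
        worker.join();
        System.out.println("worker stopped");
    }
}
```

With a plain boolean field the JIT would be allowed to read the flag once and spin forever; the opaque read rules that out without paying for acquire or volatile semantics.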

Likewise, a background task could write progress updates, like a percentage number, which the reporting (UI) thread is supposed to notice in a timely manner, while no happens-before relationship is required before the publication of the final result.

It’s also useful if you just want atomic access for long and double, without any other impact.
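As a rough illustration of the long case: a plain (non-volatile) long may be read or written as two 32-bit halves according to the JLS, so a racing reader could in principle see a torn value. The sketch below uses a hypothetical OpaqueLongCell class; opaque access makes each read and write bitwise atomic and nothing more.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaqueLongCell {
    // Deliberately not volatile: we only want atomicity, no ordering effects.
    private long value;

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(OpaqueLongCell.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Writes all 64 bits as one indivisible store. */
    public void set(long newValue) {
        VALUE.setOpaque(this, newValue);
    }

    /** Reads all 64 bits as one indivisible load. */
    public long get() {
        return (long) VALUE.getOpaque(this);
    }

    public static void main(String[] args) {
        OpaqueLongCell cell = new OpaqueLongCell();
        cell.set(0xFFFF_FFFF_0000_0000L);
        System.out.println(Long.toHexString(cell.get())); // prints ffffffff00000000
    }
}
```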

Since truly immutable objects using final fields are immune to data races, you can use opaque modes for timely publishing immutable objects, without the broader effect of release/acquire mode publishing.
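One way this might look, assuming a hypothetical immutable Endpoint class whose fields are all final. The final-field guarantees are what make the published object safe to read; the opaque write only makes sure the new reference is actually stored and eventually seen.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaquePublication {
    /** Truly immutable: only final fields, no mutable state reachable from them. */
    public static final class Endpoint {
        public final String host;
        public final int port;

        public Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Plain reference field, accessed across threads only through the VarHandle.
    private Endpoint current = new Endpoint("localhost", 8080);

    private static final VarHandle CURRENT;
    static {
        try {
            CURRENT = MethodHandles.lookup()
                    .findVarHandle(OpaquePublication.class, "current", Endpoint.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Publishes a new immutable snapshot without release semantics. */
    public void publish(Endpoint next) {
        CURRENT.setOpaque(this, next);
    }

    /** Returns whichever snapshot this thread currently observes. */
    public Endpoint latest() {
        return (Endpoint) CURRENT.getOpaque(this);
    }
}
```

A reader calling latest() may see the old or the new Endpoint, but never a half-constructed one. Anything the publisher wrote outside the Endpoint itself is not covered.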

A special case would be periodically checking a status for an expected value update and, once available, querying the value with a stronger mode (or executing the matching fence instruction explicitly). In principle, a happens-before relationship can only be established between the write and its subsequent read anyway, but since optimizers usually don’t have the horizon to identify such an inter-thread use case, performance critical code can use opaque access to optimize such a scenario.
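A sketch of that polling pattern, under the assumption that the writer uses a release store and the reader follows its successful opaque poll with VarHandle.acquireFence(). The class name and the payload/ready fields are invented for the example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaquePollThenAcquire {
    private int payload;     // plain data, published via the flag
    private boolean ready;   // status flag, polled in opaque mode

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(OpaquePollThenAcquire.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Writer side: plain write of the data, then a release store of the flag. */
    public void produce(int value) {
        payload = value;
        READY.setRelease(this, true);
    }

    /** Reader side: cheap opaque polling, one acquire fence after the flag is seen. */
    public int awaitAndConsume() {
        boolean seen = (boolean) READY.getOpaque(this);
        while (!seen) {
            Thread.onSpinWait();
            seen = (boolean) READY.getOpaque(this);
        }
        // Keeps the payload load below from moving ahead of the flag load above.
        VarHandle.acquireFence();
        return payload;
    }
}
```

The point of the design is that the (possibly many) failed polls pay only for an opaque read, and the ordering cost is paid once, after the flag has been observed.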

Giese answered 28/5, 2019 at 14:46 Comment(7)
"This implies that they cannot be reordered with respect to other memory accesses of at least the same strength" - well, isn't that exactly the point: that if reads of x and y do happen via the VarHandle, they cannot be re-ordered and will be read in program order by a reading thread? To me that implies that if a writer thread writes x = 1 and y = 1 and a reader thread observes yy == 1, it must also observe xx == 1, since they can't be reordered. - Wheels
Well, we had that point recently. If one thread reads x and y while another thread writes x and y, the reader may read a newer value for y than for x, even without reordering. If you perform opaque reads of y and x while another thread does opaque writes of x and y, both in that order, reading the new value for y implies that you also must read the new value for x subsequently, but all that only for access mode opaque or even stronger. - Giese
I think I understand your point. It seems this is not about barriers, but about the lack of any optimizations around an opaque access. Basically the hardware will respect whatever the opaque access is doing, be that a read or a write, unlike plain mode. Modes stronger than opaque add release-acquire/sequential consistency semantics on top; opaque does not. This also means that opaque accesses cannot be re-ordered among themselves (program order...). - Wheels
Not in a perceivable way. But imagine an architecture that allows multiple opaque write operations to be performed atomically. Then the order in which this instruction performs the writes is irrelevant, as no thread could read in-between and notice the order. We could also play devil's advocate here and just stop all other threads for some time. Within that time span, the writes could be performed in arbitrary order or elided like ordinary writes... - Giese
Opaque loads/stores to different addresses can be reordered. So no ordering guarantees are provided with opaque. Perhaps the compiler might not touch an opaque load/store; it doesn't mean that the hardware needs to respect it. - Spunky
@Spunky it would be great if Java finally gets an updated specification covering these memory modes. For the formally defined behavior, it doesn’t matter whether the compiler or the hardware reorders accesses. - Giese
@Giese yes, I would like to see an update to the memory model. Unfortunately, JEP 188 has stalled. openjdk.org/jeps/188 - Spunky

Opaque means that the thread executing an opaque operation is guaranteed to observe its own actions in program order, but that's it.

Other threads are free to observe the thread's actions in any order. On x86 this is a common case, since it has a

write ordered with store-buffer forwarding

memory model: even if the thread performs a store before a load, the store can be held in the store buffer, so a thread running on another core may observe the actions in the reverse order (load, then store) instead of store, then load. So an opaque operation comes for free on x86 (on x86 we actually also get acquire for free; see this extremely exhaustive answer for details on some other architectures and their memory models: https://mcmap.net/q/20644/-memory-order-consume-usage-in-c11)

Why is it useful? Well, I could speculate that if some thread observed a value stored with opaque memory semantics, then a subsequent read will observe "at least this or a later" value (plain memory access does not provide such guarantees, does it?).

Also, since Java 9 VarHandles are somewhat related to the acquire/release/consume semantics in C, I think it is worth noting that opaque access is similar to memory_order_relaxed, which is defined in the Standard as follows:

For memory_order_relaxed, no operation orders memory.

with some examples provided.

Clorindaclorinde answered 5/6, 2019 at 2:10 Comment(5)
Even with plain memory access, the CPU should not see the consequences of local reordering of loads/stores. - Spunky
@Spunky if you're talking about the same CPU doing the stores/loads, then I've not heard of an architecture with a memory model that would require an extra memory barrier to observe changes made by the same CPU. The exception is NON-TEMPORAL stores, since they bypass the regular cache coherence mechanism and require an explicit sfence to flush WC-pending writes. - Clorindaclorinde
@Spunky To my knowledge, the JVM does not currently provide a way to control caching in any way without explicitly writing assembly/compiler intrinsics (at least on x86). - Clorindaclorinde
The point is that a CPU should not observe its own loads/stores being reordered, no matter the access mode. - Spunky
So "the thread executing an opaque operation is guaranteed to observe its own actions in program order, but that's it" applies to any access mode. - Spunky

I have been struggling with opaque myself and the documentation is certainly not easy to understand.

From the above link:

Opaque operations are bitwise atomic and coherently ordered.

The bitwise atomic part is obvious. Coherently ordered means that loads/stores to a single address have some total order, each read sees the most recent write before it, and the order is consistent with the program order. For some coherence examples, see the following JCStress test.

Coherence doesn't provide any ordering guarantees between loads/stores to different addresses, so it doesn't need any fences to order loads/stores to different addresses.

With opaque, the compiler will emit the loads/stores as it sees them. But the underlying hardware is still allowed to reorder loads/stores to different addresses.

I upgraded your example to the message-passing litmus test:

thread1:
X.setOpaque(1);
Y.setOpaque(1);

thread2:
ry = Y.getOpaque();
rx = X.getOpaque();
if (ry == 1 && rx == 0) println("Oh shit");

The above could fail on a platform that allows the 2 stores or the 2 loads to be reordered (e.g. ARM or PowerPC). Opaque is not required to provide causality. JCStress has a good example for that as well.
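For anyone who wants to poke at it, here is a compilable version of the message-passing test as a rough sketch. The class and method names are my own, and a thread-per-round harness like this is a very weak stress test: it will rarely, if ever, show the outcome, especially on x86. JCStress is the proper tool; this only shows what the VarHandle plumbing looks like.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaqueMessagePassing {
    private int x;
    private int y;

    private static final VarHandle X;
    private static final VarHandle Y;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            X = lookup.findVarHandle(OpaqueMessagePassing.class, "x", int.class);
            Y = lookup.findVarHandle(OpaqueMessagePassing.class, "y", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** thread1: two opaque stores to different addresses. */
    public void writer() {
        X.setOpaque(this, 1);
        Y.setOpaque(this, 1);
    }

    /** thread2: returns true if it saw the new y but the old x. */
    public boolean readerSawReordering() {
        int ry = (int) Y.getOpaque(this);
        int rx = (int) X.getOpaque(this);
        return ry == 1 && rx == 0;
    }

    /** Runs the test on fresh state each round and counts the surprising outcomes. */
    public static int countReorderings(int rounds) {
        int hits = 0;
        for (int i = 0; i < rounds; i++) {
            OpaqueMessagePassing state = new OpaqueMessagePassing();
            boolean[] seen = new boolean[1];
            Thread t1 = new Thread(state::writer);
            Thread t2 = new Thread(() -> seen[0] = state.readerSawReordering());
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();   // join makes seen[0] visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (seen[0]) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("ry == 1 && rx == 0 seen " + countReorderings(10_000) + " times");
    }
}
```

A count of zero proves nothing about what the mode guarantees; it only says this machine and this harness did not produce the outcome.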

Also, the following IRIW example can fail:

thread1:
X.setOpaque(1);

thread2:
Y.setOpaque(1);

thread3:
rx_thread3 = X.getOpaque();
[LoadLoad]
ry_thread3 = Y.getOpaque();

thread4:
ry_thread4 = Y.getOpaque();
[LoadLoad]
rx_thread4 = X.getOpaque();

Can it be that we end up with rx_thread3=1,ry_thread3=0,ry_thread4=1 and rx_thread4 is 0?

With opaque this can happen. Even though the loads are prevented from being reordered, opaque accesses do not require multi-copy-atomicity (stores to different addresses issued by different CPUs can be seen in different orders).

Release/acquire is stronger than opaque. Since this test is allowed to fail with release/acquire, it is therefore also allowed to fail with opaque. So opaque is not required to provide consensus.

Spunky answered 8/8, 2022 at 0:39 Comment(1)
I am not ignoring this answer btw, I just want to get in the proper mood to be able to fully get it, and that takes a while... thank you for taking the time! - Wheels
