Understanding CLR 2.0 Memory Model
Joe Duffy gives six rules that describe the CLR 2.0+ memory model (its actual implementation, not any ECMA standard). I'm writing down my attempt at figuring this out, mostly as a way of rubber-ducking, but if I make a mistake in my logic, at least someone here will be able to catch it before it causes me grief.

  • Rule 1: Data dependence among loads and stores is never violated.
  • Rule 2: All stores have release semantics, i.e. no load or store may move after one.
  • Rule 3: All volatile loads are acquire, i.e. no load or store may move before one.
  • Rule 4: No loads and stores may ever cross a full-barrier (e.g. Thread.MemoryBarrier, lock acquire, Interlocked.Exchange, Interlocked.CompareExchange, etc.).
  • Rule 5: Loads and stores to the heap may never be introduced.
  • Rule 6: Loads and stores may only be deleted when coalescing adjacent loads and stores from/to the same location.

I'm attempting to understand these rules.

x = y
y = 0 // Cannot move before the previous line according to Rule 1.

x = y
z = 0
// equates to this sequence of loads and stores before possible re-ordering
load y
store x
load 0
store z

Looking at this, it appears that the load of 0 could be moved up before the load of y, but the stores may not be re-ordered at all (Rule 2). Therefore, if another thread sees z == 0, it will also see x == y.

If y were volatile, then the load of 0 could not move before the load of y (Rule 3); otherwise it may. Volatile stores don't seem to have any special properties here, since no stores can be re-ordered with respect to each other anyway (which is a very strong guarantee!).

Full barriers are like a line in the sand which loads and stores cannot be moved over.
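As a sketch of how these rules play out in a publish/consume pair (the class and field names here are my own invention for illustration, not from Duffy):

```csharp
using System;
using System.Threading;

class Publisher
{
    int _data;
    bool _ready;

    public void Produce()
    {
        _data = 42;
        _ready = true; // Rule 2: stores are release, so the store to _data cannot move after this store
    }

    // Returns the published value, or null if it is not visible yet.
    public int? Consume()
    {
        if (_ready)
        {
            // Rule 4: without this full barrier (or making _ready volatile, per Rule 3),
            // the load of _data could be re-ordered before the load of _ready.
            Thread.MemoryBarrier();
            return _data;
        }
        return null;
    }
}
```

On a single thread this is trivially correct; the barrier only matters when Produce and Consume run on different threads.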

No idea what rule 5 means.

I guess rule 6 means if you do:

x = y
x = z

Then it is possible for the CLR to delete both the load of y and the first store to x, coalescing the two adjacent stores to x.

x = y
z = y
// equates to this sequence of loads and stores before possible re-ordering
load y
store x
load y
store z
// could be re-ordered like this
load y
load y
store x
store z
// rule 6 applied means this is possible?
load y
store x // but don't pop y from stack (or first duplicate item on top of stack)
store z

What if y were volatile? I don't see anything in the rules that prohibits the above optimization from being carried out. This does not break double-checked locking, because the lock() between the two identical conditions prevents the loads from ever becoming adjacent, and according to Rule 6, coalescing adjacent loads is the only time they can be eliminated.
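For reference, this is the double-checked locking shape referred to above, sketched as a lazy singleton (the class name is illustrative; on this model the field arguably doesn't even need to be volatile, since all stores are release per Rule 2 and the reader's field loads are data-dependent per Rule 1):

```csharp
public sealed class LazySingleton
{
    static readonly object _sync = new object();
    static LazySingleton _instance;

    public static LazySingleton Instance
    {
        get
        {
            if (_instance == null)         // first load of _instance
            {
                lock (_sync)               // full barrier (Rule 4): the two loads are never
                {                          // adjacent, so Rule 6 cannot coalesce them
                    if (_instance == null) // second load of _instance
                        _instance = new LazySingleton();
                }
            }
            return _instance;
        }
    }
}
```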

So I think I understand all but Rule 5 here. Anyone want to enlighten me (or correct me, or add to any of the above)?

Priapism answered 31/5, 2010 at 3:39 Comment(0)

Joe Duffy discusses Rule 5 on pp. 517-518 of Concurrent Programming on Windows:

As an example of when a load might be introduced, consider this code:

MyObject mo = ...;
int f = mo.field;
if (f == 0)
{
    // do something
    Console.WriteLine(f);
}

If the period of time between the initial read of mo.field into variable f and the subsequent use of f in the Console.WriteLine was long enough, a compiler may decide it would be more efficient to reread mo.field twice. ... Doing this would be a problem if mo is a heap object and threads are writing concurrently to mo.field. The if-block may contain code that assumes the value read into f remained 0, and the introduction of reads could break this assumption. In addition to prohibiting this for volatile variables, the .NET memory model prohibits it for ordinary variables referring to GC heap memory too.
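In other words, the transformation that Rule 5 forbids would turn the code above into something like this (a sketch of the illegal rewrite, not code a conforming compiler may emit):

```csharp
MyObject mo = ...;
if (mo.field == 0)               // first read of mo.field
{
    // do something
    Console.WriteLine(mo.field); // introduced second read: a concurrent writer
                                 // could make this print a value other than 0
}
```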

I blogged about one important place where this matters: the standard pattern for raising an event.

EventHandler handler = MyEvent;
if (handler != null)
    handler(this, EventArgs.Empty);

In order to prevent problems with removing an event handler on a separate thread, we read the current value of MyEvent and only invoke the event handlers if that delegate is non-null.

If reads from the heap could be introduced, the compiler/JIT might decide that it could be better to read MyEvent again, rather than using the local, which would introduce a race condition.
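Concretely, an introduced re-read would effectively rewrite the safe pattern above into this unsafe shape (a sketch of the hazard, not real compiler output):

```csharp
if (MyEvent != null)
    MyEvent(this, EventArgs.Empty); // second read of MyEvent: if another thread
                                    // unsubscribed the last handler in between,
                                    // this throws NullReferenceException
```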

Ovolo answered 31/5, 2010 at 4:13 Comment(4)
Nice explanation! That explains why you wouldn't want the CLR introducing loads. I can't think of a place where the compiler/JIT might ever want to introduce stores, though; can you? – Priapism
@Eloff: Some compilers might rewrite "if (cond) { x = y; }" as "x = y; if (!cond) { x = old_x; }" if they thought it would be better for branch prediction. While there's no observable difference in a single-threaded scenario, this obviously has bad consequences if 'x' is visible to multiple threads, so the CLR memory model prohibits this. – Ovolo
Ouch, that would make low-lock programming pretty much impossible. Ditto for introducing loads. So Rule 5 is pretty critical for sane multi-threading. – Priapism
If foo is a reference to an object with an integer field "bar", and I code "intVar = foo.bar", is there a guarantee, in the absence of a memory barrier, that the field will be read after the reference? For example, if foo points to an object where bar == 5, and some other code sets some other object's bar to 5 and then makes foo point to that latter object, is it guaranteed that foo.bar will see 5 and not the new object's old value? – Schizomycete