C++ memory model - does this example contain a data race?
Asked Answered
A

5

12

I was reading Bjarne Stroustrup's C++11 FAQ and I'm having trouble understanding an example in the memory model section.

He gives the following code snippet:

// start with x==0 and y==0
if (x) y = 1; // thread 1
if (y) x = 1; // thread 2

The FAQ says there is not a data race here. I don't understand. The memory location x is read by thread 1 and written to by thread 2 without any synchronization (and the same goes for y). That's two accesses, one of which is a write. Isn't that the definition of a data race?

Further, it says that "every current C++ compiler (that I know of) gives the one right answer." What is this one right answer? Couldn't the answer vary depending on whether one thread's comparison happens before or after the other thread's write (or if the other thread's write is even visible to the reading thread)?

Anthropomorphism answered 1/1, 2014 at 17:11 Comment(0)
B
15
// start with x==0 and y==0
if (x) y = 1; // thread 1
if (y) x = 1; // thread 2

Since neither x nor y is true, the other won't be set to true either. No matter the order the instructions are executed, the (correct) result is always x remains 0, y remains 0.

Bestow answered 1/1, 2014 at 17:18 Comment(6)
This is a tough one :DForworn
I guess one of the variables can change at another place. ^^Blintz
Seems obvious now. I over-thought this one.Anthropomorphism
This (accepted) answer is wrong. This is not a data race because with that notation the memory accesses are sequential-consistent and therefore there IS a synchronization mechanism. The outcome of x=1 and y=1 however is not legal because that would fail the svr-pes20-cppmem.cl.cam.ac.uk/cppmem/… predicate, meaning it is not possible for the write to x to be a visible side effect for the read to x and the write to y to be a visible side effect to the read of y at the same time, because at least one read has to happen before the its corresponding writeHogweed
If you'd change the code to if (x.load(std::memory_order_relaxed)), etc doing all loads and stores relaxed, then this code DOES have data races because in that case there is indeed no synchronization mechanism.Hogweed
It's easy in this case (having the required knowledge anyway). Thread 1 reads x, and thread 2 writes to x. There is no synchronization mechanism in place (mutex, or sequential consistency), hence there is a data race. Most notably, it doesn't matter that there is an if () for this reasoning, at all. The reasoning is the same with or without if ()'s and it doesn't matter what is being tested (well, except a if (0) perhaps, because then the compiler will just not generate the if body at all).Hogweed
I
7

The memory location x is ... written to by thread 2

Is it really? Why do you say so?

If y is 0 then x is not written to by thread 2. And y starts out 0. Similarly, x cannot be non-zero unless somehow y is non-zero "before" thread 1 runs, and that cannot happen. The general point here is that conditional writes that don't execute don't cause a data race.

This is a non-trivial fact of the memory model, though, because a compiler that is not aware of threading would be permitted (assuming y is not volatile) to transform the code if (x) y = 1; to int tmp = y; y = 1; if (!x) y = tmp;. Then there would be a data race. I can't imagine why it would want to do that exact transformation, but that doesn't matter, the point is that optimizers for non-threaded environments can do things that would violate the threaded memory model. So when Stroustrup says that every compiler he knows of gives the right answer (right under C++11's threading model, that is), that's a non-trivial statement about the readiness of those compilers for C++11 threading.

A more realistic transformation of if (x) y = 1 would be y = x ? 1 : y;. I believe that this would cause a data race in your example, and that there is no special treatment in the standard for the assignment y = y that makes it safe to execute unsequenced with respect to a read of y in another thread. You might find it hard to imagine hardware on which it doesn't work, and anyway I may be wrong, which is why I used a different example above that's less realistic but has a blatant data race.

Inquest answered 1/1, 2014 at 17:24 Comment(1)
For a CPU to be able to handle reasonably compiled Java code, it must guarantee that a conflicting access on memory address is well defined and doesn't produce random values, so if one thread does a=b (on an atomic memory location) in a loop and the other does b=a, no thread creates an original value. (OTOH, I'm not so sure that a = b==1?1:b cannot ever create value 1 "out of thin air" in some weird cases.)Garey
H
3

There has to be a total ordering of the writes, because of the fact that no thread can write to the variable x or y until some other thread has first written a 1 to either variable. In other words you have basically three different scenarios:

  1. thread 1 gets to write to y because x was written to at some previous point before the if statement, and then if thread 2 comes later, it writes to x the same value of 1, and doesn't change it's previous value of 1.
  2. thread 2 gets to write to x because y was changed at some point before the if statement, and then thread 1 will write to y if it comes later the same value of 1.
  3. If there are only two threads, then the if statements are jumped over because x and y remain 0.
Hamel answered 1/1, 2014 at 17:16 Comment(11)
Sorry, but this kind of reasoning is invalid. It doesn't work that way; in order to check if threaded code has Undefined Behavior as defined by the Standard Memory Model (aka, has a data race or some other problem), it never matters what you might think is cause and effect if there is no sequential consistency (there was a reason that was made the default; otherwise things are so counter intuitive that 99.9% of the programmer will get it wrong). Since the loads and stores in this program ARE default and therefore seq-cst, there is no race - but your particular reasoning is the not correct one.Hogweed
@CarloWood Can you please clarify? Which of the three scenarios I describe are incorrect? As I originally stated, there must be a total ordering to the writes, a write cannot spuriously be visible in a certain order in one thread, but visible in another order in a second thread. So when a thread comes to this code branch, either a previous write to x, to y, to both variables, or to neither variable is visible to the currently executing thread.Hamel
The first line at the top is an incorrect assumption (aka, "There has to be a total ordering of the writes, because of the fact that no thread can write to the variable x or y until some other thread has first written a 1 to either variable"). That seems logical, but is not how the memory model works. This comment space is too short to explain how exactly to reason about this though. Basically, you need to make a graph where the nodes are reads of and writes to the variables involved and the (directed) edges represent certain relationships. The memory model then describes which graph isHogweed
valid and which is invalid, meaning that a only graphs that are valid can happen in a situation where compiler+hardware guarantee the memory model. In the case that both reads are relaxed, the graph where they both read the write of 1 from the other graph is valid (if it will ever happen in practise is a different story, it certainly won't on x86 hardware for example). Here is the output of my 'memorymodel' program for this case: gyazo.com/f64a50706eb2aa4203f2254c716f35cc Hopefully that link will stay valid for a while ;). (made with github.com/CarloWood/memorymodel).Hogweed
You can also test it yourself on svr-pes20-cppmem.cl.cam.ac.uk/cppmem online, because this case is not particularly very demanding.Hogweed
@CarloWood Thanks for the links to the memory model tests. When I look at svr-pes20-cppmem.cl.cam.ac.uk/cppmem though, and set execution 1 of 8 to standard, I still feel that there is a "total order" to the writes in every scenario, in the sense that for a given processor core executing a given thread, the visibility of writes is ordered at a given execution point. When a write is visible to that thread/core, it remains visible for the duration of execution, its visibility isn't variable, and it's not partial for an atomic value. Thus you can write out an "order" for the writes.Hamel
There is no difference between 'standard' and 'preferred'. It's just that the latter is faster (and mathematically proved to be equivalent to the more formal formulation as given by the standard). Also, different threads can see writes to different memory locations in a different order, unless those writes are sequentially consistent. In our case they are (I saw no reason to change that), but you can still get the result x == y == 1 when both reads are relaxed. As is show here: gyazo.com/567aae14592e1b68a02ce03af4bbfd1d There are only 3 consistent executions.Hogweed
Consistent with the C++ memory model thus. Aka, only those three executions (or graphs) can happen. The first has x == y == 0, which is trivial-- the second is shown in the screenshot and has x == y == 1 as you can see, and the third is the same except that the orange sc (sequential consistent) arrow points in the other way; so that is the how both threads see the order of the writes. I understand that it is very counter intuitive that despite that known/fixed order in this graph the result is STILL x == y == 1, but that is how it is. You will have to look at Barry's thesis and study it forHogweed
a few weeks, as I did, before you can understand this is allowed by the standard: it definitely is counter intuitive. After writing my own 'memorymodel' program I came an insight that is usable here as follows: if you follow all sb (sequenced before) and rf (read-from) edges then there might be a consistency problem if you find a loop (after all, both edges contain a sense of "happens before", so a loop can be causality violation). However, if in that loop there are at least two different variables involved that are read relaxed, then that loop is not a causality violation. This is the caseHogweed
@CarloWood Can you share a link to the thesis you mentioned? I see what you’re saying about there not being a “total order” among all the threads combined, and I think that is where we’re talking past each other... yes, the “total order” is only for a given thread, two different threads may have a different visible ordering of writes.Hamel
cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf I didn't read this one, but it's more recent and maybe covers much of the same(?): cl.cam.ac.uk/~kn307/2016/…Hogweed
I
2

Neither of the writes occurs, so there is no race. Both x and y remain zero.

(This is talking about the problem of phantom writes. Suppose one thread speculatively did the write before checking the condition, then attempted to correct things after. That would break the other thread, so it isn't allowed.)

Ingot answered 1/1, 2014 at 17:23 Comment(1)
"Neither of the writes occurs" How do you prove it?Garey
A
-3

Memory model set the supportable size of code and data areas.before comparing linking source code,we need to specify the memory model that is he can set the size limitsthe data and code.

Arita answered 13/9, 2015 at 2:57 Comment(1)
You badly misunderstood this question, it's about multi-threading.Openhearted

© 2022 - 2024 — McMap. All rights reserved.