Can multiple threads write the same value to the same variable at the same time safely?

Can multiple threads write the same value to the same variable at the same time safely?

For a specific example: is the code below guaranteed by the C++ standard to compile, run without undefined behavior, and print "true" on every conforming system?

#include <cstdio>
#include <thread>

int main()
{
    bool x = false;
    std::thread one{[&]{ x = true; }};
    std::thread two{[&]{ x = true; }};
    one.join();
    two.join();
    std::printf(x ? "true" : "false");
}

This is a theoretical question; I want to know whether it definitely always works rather than whether it works in practice (or whether writing code like this is a good idea :)). I'd appreciate it if someone could point to the relevant part of the standard. In my experience it always works in practice, but since I don't know whether it's guaranteed to work, I always use std::atomic instead. I'd like to know whether that's strictly necessary for this specific case.

Jemina answered 9/1, 2020 at 1:11 Comment(11)
The std fails to define MT programs. End of story. - Dogfight
@Dogfight What are you talking about? It has done, strongly and strictly, for almost a decade. - Limitless
@LightnessRacesBY-SA3.0 Wrong. There is no explanation as to how the semantics of non-threaded programs is extended to MT. So not only are MT programs not defined, single-threaded programs are not defined either. - Dogfight
Comment police at it yet again. @Dogfight that makes no sense. - Limitless
Somewhat of a wonky example: There are only two possible values for a bool variable. A more interesting case would be to use a double variable, and have each of the two threads store a different value. Then you could ask whether the final result was guaranteed to be one of the two values that the two threads stored, or whether it possibly could be the initial value, or whether it possibly could be some other value altogether. - Preen
@LightnessRacesBY-SA3.0 No police. Just the fact that there is nothing in the std that defines the behavior of a MT program. No explanation of how atomics behave, what executes in sequence, what undefined behavior means, etc. No nothing. A complete sham. And people are happy with that emptiness and prefer to look elsewhere, as usual. - Dogfight
@SolomonSlow Not only that. It could result in a non-value: a value that is strongly different from itself (unrelated, not the result of rounding) when examined. - Dogfight
@SolomonSlow That's an example that's more clearly wrong; I wanted to ask about the actual case I was unsure about. - Jemina
@Karu, Re, "more clearly wrong," Exactly! If you want an example of something that's wrong, would you rather have one that clearly is wrong? Or would you prefer to have one that is every bit as wrong, but the wrongness is less obvious? All of the same things that could go wrong with the float example could also go wrong with the bool example, but you have to think harder about the bool example because even when it goes wrong, it still has a good chance of giving the right answer for the wrong reason. - Preen
@SolomonSlow All MT programs are "somewhat" wrong as there is no formal basis for MT programming. - Dogfight
@SolomonSlow I didn't just want an example of something that's wrong, I was actually asking a question that I didn't know the answer to. - Jemina

No.

You need to synchronize access to those variables, either by using mutexes or by making them atomic.
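As a minimal sketch of the atomic option, here is the question's program with x changed to std::atomic<bool>. The function name write_same_value_atomically is made up for illustration and is not part of the original question. Because both stores are atomic operations, they no longer form a data race.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (name invented for this sketch): the same shape as the
// program in the question, except that x is a std::atomic<bool>.
bool write_same_value_atomically()
{
    std::atomic<bool> x{false};

    // Both threads store the same value. Atomic stores to the same object
    // are not a data race, so this has defined behaviour.
    std::thread one{[&]{ x.store(true); }};
    std::thread two{[&]{ x.store(true); }};

    one.join();
    two.join();

    // Both threads have completed, so the load observes true.
    return x.load();
}
```

Calling write_same_value_atomically() from main and printing the result gives the behaviour the question was hoping for, without relying on what a particular architecture happens to do.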

There is no exemption for when the same value is being written. You don't know what steps are involved in writing that value (which is the underlying practical concern), and neither does the standard, which is why the code has undefined behaviour. That means your compiler can make absolute mayhem of your program (and that's the real issue you need to avoid).

Someone's going to come along and tell you that such-and-such an architecture guarantees atomic writes to these sized variables. But that doesn't change the UB aspect.

The passages you're looking for are:

[intro.races/2]: Two expression evaluations conflict if one of them modifies a memory location ([intro.memory]) and the other one reads or modifies the same memory location.

[intro.races/21]: […] The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, […]. Any such data race results in undefined behavior.

… and the surrounding wording. That section is actually quite esoteric, but you don't really need to parse it as this is a classic, textbook data race that you can read about in any book on programming.

Limitless answered 9/1, 2020 at 1:12 Comment(6)
For clarity(?), it's the assignments in threads one and two themselves which are in conflict. The assignment in main and the read at the end of main are not in conflict with the assignments in threads one and two, right? - Deandeana
@JeffGarrett Right - Limitless
It isn't just the "steps", which is a low level issue. There is the higher level issue that the compiler will assume that writing only happens when no other thread can use the object. - Dogfight
That's literally what this says, @curiousguy. But, hang on, I thought the standard didn't specify anything about multi-threaded programs...? - Limitless
@LightnessRacesBY-SA3.0 The std says a lot of stuff about not being allowed to have a data race. The std has clauses. But there is nothing that defines even one C++ program. Because nothing is defined to be either sequential or not. Claiming that we have an MT semantics is a hoax (and you fell for it). There is nothing that allows anyone to reason about programs because there is no sound basis. At most you could claim that programs that have atomics and mutexes that are never created nor destroyed are defined... and that's splitting hairs. But you could claim that. - Dogfight
Hi @Lightness Races in Orbit, can we conclude that for ANY VARIABLE (of any type) that can be UPDATED by SEVERAL THREADS AT ONCE we should use locks so that only one thread writes at a time and thus achieve consistency? - Highspirited

Lightness is correct and spot-on from a standards perspective.

But here is another reason why this is not a good idea, this time from a hardware architecture perspective.

Without a memory barrier (atomic, mutex, etc...), you can encounter what's known as the cache coherency problem. On a multi-core or multi-processor machine, your two threads could both set x to true, but your main thread potentially could print false even if your compiler didn't stash x into a register. That's because the hardware cache used by the main thread hasn't yet been updated to invalidate x in whatever cache line it's on. The atomic types and lock guards provided by C++ (along with countless OS primitives) are implemented to solve this issue.

In any case, google for Cache Coherence Problem and Cache Coherence Multicore. And for a particular architecture implementation of how atomic transactions are implemented, look up the Intel LOCK prefix.
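For the lock guard route mentioned above, a minimal sketch might look like the following. The function name write_same_value_under_lock and the mutex name m are invented for illustration. x stays a plain bool, but every write happens while the mutex is held, so the two writes are ordered rather than concurrent.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper (name invented for this sketch): a plain bool written
// by two threads, with each write protected by the same mutex.
bool write_same_value_under_lock()
{
    bool x = false;
    std::mutex m;

    // Each writer takes the lock before touching x, so the writes cannot
    // overlap; the lock is released when lock goes out of scope.
    auto writer = [&]{
        std::lock_guard<std::mutex> lock{m};
        x = true;
    };

    std::thread one{writer};
    std::thread two{writer};

    // Thread completion synchronizes with the return from join(), so the
    // read below happens after both writes.
    one.join();
    two.join();

    return x;
}
```

The read of x after the joins does not need the lock in this particular program, since no writer thread is still running at that point.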

Burchette answered 9/1, 2020 at 1:46 Comment(6)
Just to be clear, my answer is not just about the standards but about practical, real effects of those rules. UB isn't just a theoretical concern: the resulting symptoms are real, and anything else/lower than that can generally be considered moot as a result. But it's nice to hear additional ways in which the architecture can trip you up in these cases, if you manage to get past the UB :) - Limitless
I don't think this answer is correct. This answer seems to be suggesting that even if you got rid of one of the threads writing to x (and thereby eliminated the UB), the main thread could nonetheless fail to see the result of the write. But I don't think that's true: the completion of a thread synchronizes with the corresponding return from std::thread::join [link], which means that after the join call returns, the main thread should observe all writes from the writer thread. - Selaginella
Not fail to see the result of the write per se. It's just that that change in the value of x might not be seen in the main immediately after the worker thread sets it. It might take additional clock cycles. But you are likely correct, the join call is a memory barrier to effectively synchronize the main thread. - Burchette
@Selaginella "the main thread should observe all writes from the writer thread" all the latest writes that couldn't be clobbered by yet another thread - Dogfight
@curiousguy: The question includes the complete program; there is no "yet another thread". - Selaginella
@Selaginella Yes indeed but it wasn't clear (to me) if your comment was specific to that example or general. - Dogfight
