C++ memory model and race conditions on char arrays

2

6

I have trouble understanding this passage from Bjarne Stroustrup's FAQ:

However, most modern processors cannot read or write a single character, it must read or write a whole word, so the assignment to c really is ``read the word containing c, replace the c part, and write the word back again.'' Since the assignment to b is similar, there are plenty of opportunities for the two threads to clobber each other even though the threads do not (according to their source text) share data!

So how can char arrays exist without 3 (or 7?) bytes of padding between elements?
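To make the quoted concern concrete, here is a minimal sketch of the "read the word, replace the c part, write the word back" sequence, replayed deterministically in a single thread. The names `replace_byte` and `simulate_lost_update` are hypothetical, and the model assumes a machine that can only load and store whole 32-bit words; it illustrates the hazard and makes no claim about how any real processor behaves.

```cpp
#include <cstdint>

// Hypothetical helper: return `word` with the byte at `index` (0 = lowest)
// replaced by `value`, as a word-only machine would have to do it.
std::uint32_t replace_byte(std::uint32_t word, unsigned index, std::uint8_t value) {
    const unsigned shift = index * 8;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;
    return (word & ~mask) | (std::uint32_t{value} << shift);
}

// Replays one unlucky interleaving of two "threads" A and B that each
// store to a different byte of the same word.
std::uint32_t simulate_lost_update() {
    std::uint32_t memory = 0;

    std::uint32_t seen_by_a = memory;  // A reads the whole word
    std::uint32_t seen_by_b = memory;  // B reads the same word before A writes

    memory = replace_byte(seen_by_a, 0, 0x11);  // A writes byte 0 back
    memory = replace_byte(seen_by_b, 1, 0x22);  // B writes byte 1 back, with a stale byte 0

    return memory;  // A's store to byte 0 has been overwritten
}
```

In this interleaving the final word is 0x00002200 rather than 0x00002211: the store to byte 0 is lost even though the two writers never touched the same byte.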

Felly answered 11/11, 2013 at 9:52 Comment(1)
Another question about this paragraph, specifically the claim it makes about "modern hardware": Can modern x86 hardware not store a single byte to memory? (TL;DR: whatever the hardware does internally, no ISA with a byte-store instruction has architecturally visible effects on the surrounding bytes, so there is no software correctness issue. Early Alpha AXP is the lone "modern" ISA without byte load/store instructions, which is a problem for the C++11 memory model.) Sandisandidge
9

I think Bjarne is wrong about this, or at least he's simplifying things considerably. Most modern processors are capable of writing a byte without reading a complete word first, or rather, they behave "as if" this were the case. In particular, if you have a char array[2]; and thread one only accesses array[0] while thread two only accesses array[1] (even when both threads are mutating their element), then you do not need any additional synchronization. The standard guarantees this: since C++11, each array element is a separate memory location, and concurrent accesses to different memory locations do not constitute a data race. If the hardware does not allow this directly, the compiler will have to add the synchronization itself.

It's very important to note the "as if", above. Modern hardware does access main memory by cache lines, not bytes. But it also has provisions for modifying single bytes in a cache line, so that when writing back, the processor core will not modify bytes that have not been modified in its cache.
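The two-thread scenario described in this answer can be sketched as follows. The function name `run_adjacent_writers` and the particular update rules are made up for illustration; the point is only that neither a mutex nor an atomic is needed, because each thread touches its own element.

```cpp
#include <thread>
#include <utility>

// Hypothetical helper: two threads each mutate their own element of a
// two-element char array, with no locks or atomics. Under the C++11
// memory model array[0] and array[1] are distinct memory locations,
// so this is not a data race.
std::pair<int, int> run_adjacent_writers(int iterations) {
    char array[2] = {0, 0};

    std::thread t0([&array, iterations] {
        for (int i = 0; i < iterations; ++i) {
            // Only array[0] is read and written here.
            array[0] = static_cast<char>((array[0] + 1) % 100);
        }
    });
    std::thread t1([&array, iterations] {
        for (int i = 0; i < iterations; ++i) {
            // Only array[1] is read and written here.
            array[1] = static_cast<char>((array[1] + 2) % 100);
        }
    });

    // join() synchronizes with the end of each thread, so reading
    // both elements afterwards is well defined.
    t0.join();
    t1.join();

    return std::make_pair(static_cast<int>(array[0]),
                          static_cast<int>(array[1]));
}
```

On a conforming implementation each element ends up with exactly the value its own thread computed, regardless of how the two threads interleave. On some toolchains the program may need to be linked with thread support (for example `-pthread`).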

Didst answered 11/11, 2013 at 10:32 Comment(0)
7

A platform that supports C++11 must be able to write to storage the size of one char without inventing writes to neighbouring bytes. x86 does indeed have that ability. If a processor can only ever modify 32 bits at a time, then char on that platform must be 32 bits wide.

(Some background reasoning: arrays are stored contiguously, and chars have no padding bits (C++11 §3.9.1).)
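The layout facts behind that reasoning can be checked at compile time. This is a small sketch; `adjacent_element_distance` is a hypothetical helper added only to show the address arithmetic.

```cpp
#include <climits>
#include <cstddef>

// sizeof(char) is 1 by definition, and an array of N chars occupies
// exactly N bytes, so there is no room for padding between elements.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(char[4]) == 4, "char arrays have no inter-element padding");
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

// Hypothetical helper: distance in bytes between two adjacent elements.
std::ptrdiff_t adjacent_element_distance() {
    char array[2] = {'a', 'b'};
    return reinterpret_cast<unsigned char*>(&array[1]) -
           reinterpret_cast<unsigned char*>(&array[0]);
}
```

The helper returns 1 on any conforming implementation. Note that a "byte" here means one char, which is CHAR_BIT bits wide and need not be 8 bits on every platform.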

Adult answered 11/11, 2013 at 9:54 Comment(5)
@NoSenseEtAl: As long as he's able to think of enough other platforms... which a character of his description most certainly is :-) Adult
@NoSenseEtAl: For what it's worth, Herb Sutter makes this point quite clearly in the Atomic Weapons talks. Adult
@NoSenseEtAl: Also, I think the point is that the naive implementation of a read-modify-write invents spurious writes. But that's not to say that the architecture doesn't also support a more expensive, correct operation. In single-threaded mode, you would have no desire to pay such a price. Adult
Regarding single-threaded mode: AFAIK the compiler can't know whether the program is single-threaded or not. Hans explicitly mentioned that they lost a bit of performance by disallowing certain things like speculative writes. Felly
@Felly There is a difference between what the RAM bus does and the logical behaviour of the CPU. Stroustrup knows that (and expects his readers to know it too). x86 assembly can easily access individual bytes, but the RAM interface reads and writes whole 32-bit (or 64-bit) words. Votary
