OpenMP atomic and non-atomic reads/writes produce the same instructions on x86_64

According to the OpenMP Specification (v4.0), the following program contains a possible data race due to unsynchronized read/write of i:

int i{0}; // std::atomic<int> i{0};

void write() {
// #pragma omp atomic write // seq_cst
   i = 1;
}

int read() {
   int j;
// #pragma omp atomic read // seq_cst
   j = i; 
   return j;
}

int main() {
   #pragma omp parallel
   { /* code that calls both write() and read() */ }
}

Possible solutions that came to my mind are shown in the code as comments:

to protect write and read of i with #pragma omp atomic write/read,
to protect write and read of i with #pragma omp atomic write/read seq_cst,
to use std::atomic<int> instead of int as a type of i.

Here are the compilers-generated instructions on x86_64 (with -O2 in all cases):

GNU g++ 4.9.2:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
// #pragma omp atomic seq_cst:  MOV           MOV
#pragma omp atomic seq_cst:  MOV+MFENCE    MOV    (see UPDATE)
std::atomic<int>:            MOV+MFENCE    MOV

clang++ 3.5.0:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
#pragma omp atomic seq_cst:  MOV           MOV
std::atomic<int>:            XCHG          MOV

Intel icpc 16.0.1:           i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          *             *
#pragma omp atomic seq_cst:  *             *
std::atomic<int>:            XCHG          MOV

* Multiple instructions with calls to __kmpc_atomic_xxx functions.

What I wonder is why the GNU/clang compiler does not generate any special instructions for #pragma omp atomic writes. I would expect similar instructions as for std::atomic, i.e, either MOV+MFENCE or XCHG. Any explanation?

UPDATE

g++ 5.3.0 produces MFENCE for #pragma omp atomic write seq_cst. That is the correct behavior, I believe. Without seq_cst, it produces plain MOV, which is sufficient for non-SC atomicity.

There was a bug in my Makefile, g++ 4.9.2 produces MFENCE for CS atomic write as well. Sorry guys for that.

Clang 3.5.0 does not implement the OpenMP SC atomics, thanks Hristo Iliev for pointing this out.

There are two possibilities.

The compiler is not obligated to convert C++ code containing a data race into bad machine code. Depending on the machine memory model, the instructions normally used may already be atomic and coherent. Take that same C++ code to another architecture and you may start seeing the pragmas cause differences that didn't exist on x86_64.
In addition to potentially causing use of different instructions and/or extra memory fence instructions, the atomic pragmas (as well std::atomic and volatile) also constrain the compiler's own code reordering optimizations. They may not apply to your simply case, but you certainly could see that common-subexpression elimination, including hoisting computations outside a loop, may be affected.

Recommended topics

Hot tags