It is known that on x86, for the operations load() and store(), the memory orderings memory_order_consume, memory_order_acquire, memory_order_release, and memory_order_acq_rel do not require any processor instructions for the cache or pipeline, so the generated assembly always corresponds to std::memory_order_relaxed; these constraints are needed only to restrict compiler optimizations: http://www.stdthread.co.uk/forum/index.php?topic=72.0
The disassembly confirms this for store() (MSVS2012 x86_64):
std::atomic<int> a;
a.store(0, std::memory_order_relaxed);
000000013F931A0D mov dword ptr [a],0
a.store(1, std::memory_order_release);
000000013F931A15 mov dword ptr [a],1
But the disassembly doesn't confirm this for load() (MSVS2012 x86_64), which uses lock cmpxchg:
int val = a.load(std::memory_order_acquire);
000000013F931A1D prefetchw [a]
000000013F931A22 mov eax,dword ptr [a]
000000013F931A26 mov edx,eax
000000013F931A28 lock cmpxchg dword ptr [a],edx
000000013F931A2E jne main+36h (013F931A26h)
std::cout << val << "\n";
Anthony Williams: "some_atomic.load(std::memory_order_acquire) does just drop through to a simple load instruction, and some_atomic.store(std::memory_order_release) drops through to a simple store instruction."
Where am I wrong? Do the semantics of std::memory_order_acquire require the processor instruction lock cmpxchg on x86/x86_64, or only a simple load instruction mov, as Anthony Williams said?
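For context, the acquire/release pairing the question is about can be sketched as below. This is only a minimal illustration, not code from the question: the names payload, ready, producer, consumer, and run_handoff are all hypothetical. It shows what the orderings guarantee at the language level; which instructions a given compiler emits for them is a separate matter.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names for illustration; none of them come from the question.
static int payload = 0;                 // plain, non-atomic data
static std::atomic<bool> ready{false};  // flag that publishes the payload

// Writer: fill in the payload, then publish it with a release store.
static void producer() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Reader: spin on an acquire load; once it observes true,
// the earlier write to payload is guaranteed to be visible.
static int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one producer/consumer handoff and returns the value the consumer saw.
int run_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Calling run_handoff() from main should return 42, because the acquire load that reads true synchronizes with the release store that wrote it.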
ANSWER: It is the same as this bug report: http://connect.microsoft.com/VisualStudio/feedback/details/770885
[…] mov. Have the Microsoft developers really failed at the simplest task of all: "do nothing"? :) – Lubumbashi
[…] volatile - not because the C++ standard requires it, but because some bits of code that USED to work on single core processors suddenly work poorly if you use SMP systems. This looks similar to one of those situations. – Sacrament
[…] std::memory_order. And to avoid unnecessary calls to the WinAPI or assembler code, they decided to use barriers (lock) for volatile - these three solutions are equally ugly. But now, with the new C++11 standard, everything is clearly defined and there is one elegant solution - mov. Maybe older x86 processors require lock for load()? – Lubumbashi
[…] volatile being used inside std::atomic (I believe volatile is required by the standard). – Sacrament
std::atomic and volatile are very different things according to the standard; they must be used in different cases, and std::atomic must not use volatile in its implementation. drdobbs.com/parallel/volatile-vs-volatile/212701484?pgno=1 – Lubumbashi
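To illustrate the distinction drawn in the last comment, here is a minimal sketch of a job that std::atomic does and volatile does not: an atomic read-modify-write shared between threads. The helper name count_with_atomic is hypothetical. Doing the same with a volatile int counter would be a data race in standard C++, so that variant is deliberately not shown as runnable code.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: two threads each add per_thread increments to one counter.
int count_with_atomic(int per_thread) {
    std::atomic<int> counter{0};
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            // Atomic read-modify-write; relaxed ordering is enough for a pure counter.
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    // Both threads have been joined, so every increment is visible here.
    return counter.load(std::memory_order_relaxed);
}
```

With per_thread set to 100000, the function should return exactly 200000, since no increment can be lost.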