memory-barriers Questions
2
Consider the following code:
struct payload
{
std::atomic< int > value;
};
std::atomic< payload* > pointer( nullptr );
void thread_a()
{
payload* p = new payload();
p->value.st...
Marvamarve asked 20/6, 2015 at 7:57
3
Solved
If I lock a std::mutex, will I always get a memory fence? I am unsure whether locking merely implies a fence or is actually required to provide one.
Update:
Found this reference following up on RMF's comments.
Multithreaded pr...
Lucia asked 23/6, 2012 at 20:56
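A minimal sketch of what the C++ memory model guarantees here, assuming nothing beyond the standard library: unlocking a std::mutex synchronizes with the next lock of the same mutex, so writes made inside one critical section are visible in the next. Whether the implementation emits a dedicated fence instruction to achieve this is platform-specific. The helper name run_mutex_counter is hypothetical and not taken from the question.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not from the question): several threads increment a
// plain, non-atomic int while holding the same std::mutex.
int run_mutex_counter(int num_threads, int iterations) {
    int counter = 0;  // deliberately not atomic; the mutex protects it
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                // lock() has acquire semantics, unlock() has release semantics,
                // so each increment sees the result of the previous one.
                std::lock_guard<std::mutex> guard(m);
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling run_mutex_counter(4, 10000) should return 40000, because every increment happens inside the critical section and no update is lost.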
2
Solved
Is memory outside each core always conceptually flat/uniform/synchronous in a multiprocessor system?
Multi processor systems perform "real" memory operations (those that influence definitive executions, not just speculative execution) out of order and asynchronously as waiting for global synchroni...
Headliner asked 23/5, 2019 at 4:48
2
Solved
I have an object 64 bytes in size:
typedef struct _object{
int value;
char pad[60];
} object;
In main I am initializing an array of objects:
volatile object * array;
int arr_size = 1000000;
arr...
Partridge asked 13/5, 2019 at 17:46
1
Solved
The Memory Order Machine Clear performance event is described by the vTune documentation as:
The memory ordering (MO) machine clear happens when a snoop request from another processor matches a ...
Curse asked 7/4, 2019 at 19:52
2
Solved
I'm checking how the compiler emits instructions for multi-core memory barriers on x86_64. The below code is the one I'm testing using gcc_x86_64_8.3.
std::atomic<bool> flag {false};
int any...
Proffer asked 18/3, 2019 at 23:42
1
I am trying to replace clock_gettime(CLOCK_REALTIME, &ts) with rdtsc to benchmark code execution time in terms of cpu cycles rather than server time. The execution time of the bench-marking cod...
Restoration asked 14/2, 2019 at 12:43
3
Solved
I was reading this question about using a bool for thread control and got intrigued by this answer by @eran:
Using volatile is enough only on single cores, where all threads use the same cache. On...
Childs asked 20/6, 2015 at 20:5
2
Solved
I have a thread that reads from a socket and generates data. After every operation, the thread checks a std::atomic_bool flag to see if it must exit early.
In order to cancel the operation, I set...
Debauchery asked 6/12, 2018 at 14:10
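The flag-checking pattern described in this excerpt can be sketched roughly as follows. This is an illustration under assumptions, not the asker's code: the socket read is replaced by a yield, and the names stop_requested, worker_loop, and run_and_cancel are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical flag name; the question only says a std::atomic_bool is checked.
std::atomic<bool> stop_requested{false};

// Stand-in for the worker: loops until the flag is set and reports how many
// iterations it completed. A real worker would read from a socket here.
long worker_loop() {
    long iterations = 0;
    while (!stop_requested.load(std::memory_order_acquire)) {
        ++iterations;
        std::this_thread::yield();
    }
    return iterations;
}

// Starts the worker, requests cancellation, and waits for it to finish.
// Returns true once the worker has observed the flag and exited.
bool run_and_cancel() {
    stop_requested.store(false, std::memory_order_relaxed);
    long result = -1;
    std::thread t([&] { result = worker_loop(); });
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    stop_requested.store(true, std::memory_order_release);
    t.join();  // join() makes the worker's write to result visible here
    return result >= 0;
}
```

The atomic flag only stops a loop that gets around to checking it. A thread blocked inside a socket read will not see the flag until that call returns, so it needs a separate wake-up mechanism.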
5
Solved
The Linux kernel uses lock; addl $0,0(%%esp) as write barrier, while the RE2 library uses xchgl (%0),%0 as write barrier. What's the difference and which is better?
Does x86 also require read barr...
Pyelography asked 20/11, 2010 at 12:15
1
Solved
ARM allows reordering loads with subsequent stores, so that the following pseudocode:
// CPU 0 | // CPU 1
temp0 = x; | temp1 = y;
y = 1; | x = 1;
can result in temp0 == temp1 == 1 (and, this...
Numerator asked 7/9, 2018 at 3:53
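The two-CPU pseudocode above can be written as a small C++ litmus test with relaxed atomics. This is a sketch; the function name run_load_store_litmus is hypothetical. The C++ memory model formally permits both loads to observe 1 under memory_order_relaxed, though a single run on real hardware will rarely, if ever, show it.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical helper: runs the load-then-store pattern once on two threads
// and returns {temp0, temp1}.
std::pair<int, int> run_load_store_litmus() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int temp0 = -1;
    int temp1 = -1;

    std::thread cpu0([&] {
        temp0 = x.load(std::memory_order_relaxed);  // temp0 = x;
        y.store(1, std::memory_order_relaxed);      // y = 1;
    });
    std::thread cpu1([&] {
        temp1 = y.load(std::memory_order_relaxed);  // temp1 = y;
        x.store(1, std::memory_order_relaxed);      // x = 1;
    });

    cpu0.join();
    cpu1.join();
    return {temp0, temp1};
}
```

Each returned value is either 0 or 1. Observing the reordered outcome in practice usually takes many repeated runs on a weakly ordered machine.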
2
Solved
I have read the doc of std::memory_order_relaxed.
One part of the explanation of relaxed ordering is ....
// Thread 1:
r1 = y.load(memory_order_relaxed); // A
x.store(r1, memory_order_relaxed); // B...
Novitiate asked 6/9, 2018 at 8:36
2
Suppose a core writes while the cache line is not present in its L1, so the write goes to the store buffer. Another core requests that cache line; MESI cannot see the store buffer update and returns the unmodi...
Effete asked 20/9, 2015 at 16:44
1
Solved
Executing the following is an atomic RMW operation
auto value = atomic.fetch_or(value, order);
When order is std::memory_order_acq_rel we know that the load of the previous value in the atomic ...
Succentor asked 23/8, 2018 at 4:57
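As a small illustration of the read-modify-write call quoted above: fetch_or returns the value held before the OR and stores the combined value in one atomic step. The names demo_fetch_or and flags are hypothetical, and the single-threaded setup is only there to make the result easy to follow.

```cpp
#include <atomic>
#include <utility>

// Hypothetical helper: performs one fetch_or and returns
// {value before the operation, value after the operation}.
std::pair<unsigned, unsigned> demo_fetch_or() {
    std::atomic<unsigned> flags{0x1u};
    // Atomically ORs in bit 2; acq_rel makes the load part an acquire
    // and the store part a release.
    unsigned previous = flags.fetch_or(0x4u, std::memory_order_acq_rel);
    return {previous, flags.load(std::memory_order_acquire)};
}
```

Here the first element is the old value 0x1 and the second is the updated value 0x5.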
1
Solved
As a follow-up to this topic, in order to calculate the memory miss latency, I have written the following code using _mm_clflush, __rdtsc and _mm_lfence (which is based on the code from this question...
Flamenco asked 22/8, 2018 at 9:32
1
Solved
I have already seen this answer and this answer, but neither appears to be clear and explicit about the equivalence or non-equivalence of mfence and xchg under the assumption of no non-temporal instru...
Yancey asked 22/8, 2018 at 22:12
2
In recent Intel ISA documents the lfence instruction has been defined as serializing the instruction stream (preventing out-of-order execution across it). In particular, the description of the inst...
Cantrell asked 14/8, 2018 at 15:26
1
Solved
I was reading the Intel 64 and IA-32 instruction set guide to get an idea about memory fences. My question is that for an example with SFENCE, in order to make sure that all store operations are glo...
Postwar asked 12/8, 2018 at 13:5
1
Solved
It is my understanding that C# is a safe language and doesn't allow one to access unallocated memory, other than through the unsafe keyword. However, its memory model allows reordering when there i...
Kemp asked 4/7, 2018 at 21:12
1
Solved
I have been trying to Google my question but I honestly don't know how to succinctly state the question.
Suppose I have two threads in a multi-core Intel system. These threads are running on the s...
Headache asked 11/7, 2018 at 19:12
1
Solved
mov 0x0ff, 10
sfence
mov 0x0ff, 12
sfence
Can it be executed by an x86 CPU as:
mov 0x0ff, 12
sfence
?
Wurth asked 10/3, 2018 at 21:21
3
Solved
The Intel Architectures Software Developer's Manual, Aug. 2012, vol. 3A, sect. 8.2.2:
Any two stores are seen in a consistent order by processors other than
those performing the stores.
But c...
Valorous asked 9/1, 2013 at 4:23
2
Solved
As far as I know, a function call acts as a compiler barrier, but not as a CPU barrier.
This tutorial says the following:
acquiring a lock implies acquire semantics, while releasing a lock
imp...
Molt asked 20/6, 2018 at 14:47
1
Solved
I am going through the assembly generated by GCC for an ARM Cortex M4, and noticed that atomic_compare_exchange_weak gets two DMB instructions inserted around the condition (compiled with GCC 4.9 u...
Munger asked 11/6, 2018 at 14:29
4
Solved
I read the "Intel Optimization Guide for Intel Architecture".
However, I still have no idea about when I should use
_mm_sfence()
_mm_lfence()
_mm_mfence()
Could anyone explain when these...
Numidia asked 27/12, 2010 at 9:35