Consider this example:
    #include <atomic>
    #include <chrono>   // std::chrono::seconds
    #include <ctime>    // std::time
    #include <iostream>
    #include <thread>

    struct SpinLock{
        std::atomic<bool> state;
        void lock(){
            bool expected = false;
            while(!state.compare_exchange_strong(expected,true,std::memory_order::acquire,std::memory_order::relaxed)){
                expected = false; // a failed CAS overwrites expected, so reset it
            }
        }
        void unlock(){
            state.store(false,std::memory_order::release);
        }
    };

    int main(){
        auto spin_lock = SpinLock{false};
        int i = 0;
        std::thread t1([&](){
            std::this_thread::sleep_for(std::chrono::seconds(1));
            spin_lock.lock();
            auto time_stamp = std::time(nullptr);
            std::cout<<"t1 " <<time_stamp <<" "<< i<<"\n"; // #1
            spin_lock.unlock();
        });
        std::thread t2([&](){
            spin_lock.lock();
            i = 1;
            auto time_stamp = std::time(nullptr);
            std::cout<<"t2 " <<time_stamp<<" "<< i<<"\n"; // #2
            spin_lock.unlock();
        });
        t1.join();
        t2.join();
    }
From the perspective of the C++ standard, is it possible that #1 prints i == 0 with a timestamp representing a later time, while #2 prints i == 1 with a timestamp representing an earlier time?
If #1 reads i == 0, then the store to state in t1 is earlier in the modification order of state than that of t2 (i.e. the CAS operation in t1 wins the race, so t1 acquires the lock first). Otherwise the value read must be 1, since the unlock in t2 would synchronize with the lock in t1. IIUC, the modification order is not tied to the order on the timeline, so the outcome could be
t1 1729172229 0
t2 1729172228 1
This outcome is unintuitive. However, from the perspective of the C++ standard, is it a possible outcome?
std::time it is an archaic function and might behave dubiously. – Sanguineous
time() function so it won't actually overflow until 2038 even on implementations where time_t is a 32-bit integer. (man7.org/linux/man-pages/man3/time.3p.html). It is a pretty coarse time, though, only 1-second resolution, bad for noticing reordering effects unless they happen to overlap a transition. – Bunni
compare_exchange_weak would be enough in that context where we have a loop anyway. – Goodly
steady_clock or any other clock that is steady. So if the ISA can implement steady_clock by reading directly from system registers, etc, then it needs to insert appropriate synchronization barriers to ensure correctness. – Showdown
rdtsc can reorder with memory operations even if the memory operations have acquire or release, in ways that a load couldn't. Such a clock could still be steady in practice, although the OP didn't specify that requirement. On x86 for example, rdtsc and rdtscp have no inputs so their clock-reading uops can execute as soon as there's a free execution port. All mainstream CPUs schedule uops oldest-ready first, so if there's only one port that can run the actual clock-read uop, the older one will exec first. – Bunni
steady_clock monotonic, it respects happens-before across threads. – Bunni