I found a BSD-licensed (not GPL) source that shows how to read the system's real-time counter on ARM64: https://github.com/cloudius-systems/osv/blob/master/arch/aarch64/arm-clock.cc
Even better, the code there doesn't just deliver some sort of ticks, as rdtsc() does on Intel/AMD; it also reports the frequency of those ticks. And yes, the counter is in sync across all cores on multi-core systems. So it can be useful for many things, including benchmarking or keeping track of the threads in a thread pool. Of course, it won't have the long-term stability of a system clock that is synced to an external time source via ntp.
The class arm_clock defined in the cited code might be overkill for many purposes. For example, it also shows how to set hardware timers, which a normal user-mode process likely won't have permission to do. Here is an excerpt of the most important parts, enough to just read the counter and its frequency. It compiles fine with recent GCC on Intel, AMD and ARM. Of course, rdtsc_barrier() and the frequency reading are provided only on ARM:
#include <stdint.h>

#ifdef __ARM_ARCH_ISA_A64
// Adapted from: https://github.com/cloudius-systems/osv/blob/master/arch/aarch64/arm-clock.cc
uint64_t rdtsc() {
    // Read the CNTVCT_EL0 system register, which provides the
    // system-wide consistent value of the virtual system counter.
    uint64_t cntvct;
    asm volatile ("mrs %0, cntvct_el0; " : "=r"(cntvct) :: "memory");
    return cntvct;
}

// Same read, but the ISBs keep it from being reordered with
// the surrounding instructions.
uint64_t rdtsc_barrier() {
    uint64_t cntvct;
    asm volatile ("isb; mrs %0, cntvct_el0; isb; " : "=r"(cntvct) :: "memory");
    return cntvct;
}

// Counter frequency in Hz, read from CNTFRQ_EL0.
uint32_t rdtsc_freq() {
    // MRS transfers a 64-bit register; the frequency is in the low 32 bits.
    uint64_t freq_hz;
    asm volatile ("mrs %0, cntfrq_el0; isb; " : "=r"(freq_hz) :: "memory");
    return (uint32_t)freq_hz;
}
#else
#include <x86intrin.h>
uint64_t rdtsc() { return __rdtsc(); }
#endif
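Because the ARM counter comes with a known frequency, a tick difference can be converted to time. The following is only a minimal sketch of that arithmetic, not part of the OSv code: ticks_to_ns is a hypothetical helper name, and it assumes the tick delta comes from rdtsc() and the frequency from rdtsc_freq(). The division is split into whole seconds and a remainder so that the multiplication by 10^9 does not overflow 64 bits for long intervals.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the OSv source): convert a tick delta to
// nanoseconds, given the counter frequency in Hz.
uint64_t ticks_to_ns(uint64_t ticks, uint32_t freq_hz) {
    assert(freq_hz != 0);  // a zero frequency would mean the register was not set up
    const uint64_t ns_per_s = 1000000000ULL;
    uint64_t whole_s = ticks / freq_hz;  // full seconds in the delta
    uint64_t rem = ticks % freq_hz;      // leftover ticks, always < freq_hz
    // rem < 2^32, so rem * ns_per_s stays below 2^64.
    return whole_s * ns_per_s + rem * ns_per_s / freq_hz;
}
```

On ARM64, something like ticks_to_ns(rdtsc() - start, rdtsc_freq()) would then give the elapsed nanoseconds since start was read. The result is truncated, not rounded.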
In my tests, execution times were in the range of 7 ns for rdtsc() and 30 ns for rdtsc_barrier(), which is quite similar to Intel and AMD.
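For anyone who wants to reproduce such numbers, here is one possible way to estimate the per-call cost. This is a rough sketch and not necessarily how the figures above were obtained: avg_ns_per_call is a hypothetical helper, it uses std::chrono::steady_clock as the reference clock, and the loop overhead is included in the result.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: average wall-clock cost, in nanoseconds, of one call
// to read_counter, measured over `iterations` back-to-back calls.
template <typename F>
double avg_ns_per_call(F read_counter, uint64_t iterations) {
    assert(iterations > 0);
    volatile uint64_t sink = 0;  // volatile store keeps the calls from being optimized away
    auto start = std::chrono::steady_clock::now();
    for (uint64_t i = 0; i < iterations; ++i) {
        sink = read_counter();
    }
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

It could be called as avg_ns_per_call(rdtsc, 10000000) or avg_ns_per_call(rdtsc_barrier, 10000000). A large iteration count helps average out the resolution of the reference clock.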
rdtsc on multi-core processors can have issues. See msdn.microsoft.com/en-us/library/ee417693(VS.85).aspx – Demission
rdtsc() is very reliable on all modern CPUs. Even on multi-socket systems with CPUs that are a few years old, I get nearly identical values for rdtsc() across all cores. Only very old systems that don't have constant_tsc and nonstop_tsc in their capabilities have the issues mentioned in the Microsoft document. – Tricia