Microsecond resolution timestamps on Windows

9

19

How do I get microsecond resolution timestamps on Windows?

I am looking for something better than QueryPerformanceCounter and QueryPerformanceFrequency. These can only give you an elapsed time since boot, and they are not necessarily accurate when called on different threads; that is, QueryPerformanceCounter may return different results on different CPUs. Some processors also adjust their frequency for power saving, which apparently isn't always reflected in the QueryPerformanceFrequency result.

There is Implement a Continuously Updating, High-Resolution Time Provider for Windows, but it does not seem to be solid. When microseconds matter looks great, but it's not available for download any more.

Another resource is Obtaining Accurate Timestamps under Windows XP, but it requires a number of steps: running a helper program plus some initialization. Also, I am not sure whether it works on multiple CPUs.

I also looked at the Wikipedia article Time Stamp Counter which is interesting, but not that useful.

If the answer is just do this with BSD or Linux, it's a lot easier and that's fine, but I would like to confirm this and get some explanation as to why this is so hard in Windows and so easy in Linux and BSD. It's the same fine hardware...

Unaffected answered 10/3, 2010 at 3:39 Comment(2)
Is there an example of how it's easy to do in Linux or BSD?Morly
You know, if you read the QueryPerformanceCounter documentation, it says explicitly that it does work when called from different threads, and is not affected by power saving. The only exception is buggy BIOS, drivers and/or hardware. Read the docs before dismissing an API ;)Submit
M
21

I believe this is still useful: System Internals: Guidelines For Providing Multimedia Timer Support.

It does a good job of explaining the various timers available and their limitations. It might be that your archenemy will not so much be resolution, but latency.

QueryPerformanceCounter will not always run at CPU speed. In fact, it might try to avoid RDTSC, especially on multi-processor(/multi-core) systems: it will use the HPET on Windows Vista and later if it is available or the ACPI/PM timer. On my system (Windows 7 x64, dual core AMD) the timer runs at 14.31818 MHz.

The same is true for earlier systems:

"By default, Windows Server 2003 Service Pack 2 (SP2) uses the PM timer for all multiprocessor APIC or ACPI HALs, unless the check process to determine whether the BIOS supports the APIC or ACPI HALs fails."

The problem arises when the check fails. This simply means that your computer or BIOS is broken in some way. You can then either fix your BIOS (recommended), or at least switch to the ACPI PM timer (/usepmtimer) for the time being.

It is easy from C# - without P/Invoke - to check for high-resolution timer support with Stopwatch.IsHighResolution and then peek at Stopwatch.Frequency. It will make the necessary QueryPerformanceCounter call internally.
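For readers not on .NET, a rough C++ counterpart of that check is to measure the smallest step a clock actually takes. This is only a sketch: `observed_step_ns` is a made-up helper name, and the result is a single sample that can be inflated by scheduling noise, so it is an upper bound on the granularity rather than a specification of it.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: spins until the clock's reading changes and returns
// the size of that step in nanoseconds. One sample only; call it several
// times and keep the minimum for a better estimate.
template <class Clock>
std::int64_t observed_step_ns()
{
    const auto first = Clock::now();
    auto next = Clock::now();
    while (next == first) {
        next = Clock::now();  // busy-wait until the clock advances
    }
    const std::int64_t step =
        std::chrono::duration_cast<std::chrono::nanoseconds>(next - first).count();
    // A steady (monotonic) clock can only move forward.
    assert(step > 0 || !Clock::is_steady);
    return step;
}
```

Calling `observed_step_ns<std::chrono::steady_clock>()` a few times and taking the smallest value gives a feel for whether the clock is backed by a high-resolution source.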

Also consider that if the timers are broken, the whole system will go haywire and generally behave strangely (reporting negative elapsed times, slowing down, etc.), not just your application.

This means that you can actually rely on QueryPerformanceCounter.

... and contrary to popular belief, QueryPerformanceFrequency() "cannot change while the system is running".
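Since the frequency is fixed, it can be read once and reused for every conversion. Here is a minimal sketch of turning counter ticks into microseconds; `ticks_to_microseconds` and `qpc_microseconds` are names I made up, and the Windows-only part is guarded so the arithmetic can be tried anywhere.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Converts a raw tick count to microseconds. Splitting into whole seconds
// and a remainder avoids overflowing 64 bits in ticks * 1000000 on machines
// with long uptimes.
inline std::int64_t ticks_to_microseconds(std::int64_t ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    const std::int64_t whole_seconds = ticks / frequency;
    const std::int64_t remainder = ticks % frequency;
    return whole_seconds * 1000000 + (remainder * 1000000) / frequency;
}

#ifdef _WIN32
// Hypothetical helper: microseconds since the counter's arbitrary start point.
inline std::int64_t qpc_microseconds()
{
    // The frequency is queried once and cached.
    static const std::int64_t frequency = [] {
        LARGE_INTEGER f;
        QueryPerformanceFrequency(&f);
        return static_cast<std::int64_t>(f.QuadPart);
    }();
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    return ticks_to_microseconds(now.QuadPart, frequency);
}
#endif
```

With the 14.31818 MHz timer mentioned above, 14318180 ticks come out as 1000000 microseconds, and one tick is roughly 70 ns, so microsecond resolution is within reach.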

Edit: As the documentation on QueryPerformanceCounter() states, "it should not matter which processor is called". In fact, all the hacking around with thread affinity is only needed if the APIC/ACPI detection fails and the system resorts to using the TSC. That is a last resort that should not be needed. If it happens on older systems, there is likely a BIOS update or driver fix from the manufacturer. If there is none, the /usepmtimer boot switch is still there. If that fails as well, because the system has no proper timer apart from the Pentium TSC, you might in fact consider messing with thread affinity. Even then, the sample provided by others in the "Community Content" area of the page is misleading: it sets thread affinity on every start/stop call, which has a non-negligible overhead, introduces considerable latency, and likely diminishes the benefits of using a high-resolution timer in the first place.

Game Timing and Multicore Processors is a recommendation on how to use them properly. Please consider that it is now five years old, and at that time fewer systems were fully ACPI compliant/supported - that is why while bashing it, the article goes into so much detail about TSC and how to work around its limitations by keeping an affine thread.

I believe it is a fairly hard task nowadays to find a common PC with zero ACPI support and no usable PM timer. The most common case is probably BIOS settings, when ACPI support is incorrectly set (sometimes sadly by factory defaults).

Anecdotes tell that eight years ago, the situation was different in rare cases. (Makes a fun read, developers working around design "shortcomings" and bashing chip designers. To be fair, it might be the same way vice versa. :-)

Mabel answered 16/3, 2010 at 20:36 Comment(6)
so "use QueryPerformanceCounter" ? wow, if only I'd thought of that... also, this still can return negative results even on my new core i7, so be wary. also, I don't think anyone is suggesting you would change the thread affinity just to call QueryPerformanceCounter and then change it back. just leave the thread set to one core permanently.Ballerina
@matt: I'm not saying that you - or anyone here in this thread - suggested changing thread affinity on each call. However the community content supplied right here, on this very page does just that: msdn.microsoft.com/en-us/library/ms644904%28VS.85%29.aspxMabel
@matt: so in fact, yes, use QPC - but make sure that it does not use RDTSC or you will end up with a nicely disguised API call to RDTSC.Mabel
If you need HPET, make sure settings in the BIOS are correct. One that I came across is "ACPI HPET Table => Enabled." The other is "HPET Support => Enabled" then "HPET Mode => 32bit/64bit", depending whether you run Vista(Win7) x86 or x64. HPET: blog.fpmurphy.com/2009/07/linux-hpet-support.html HPET is Vista and up only. PM (ACPI) timers can be used on older systems as well.Mabel
@andras: cool, I'll definitely check out the bios settings! I've certainly done QPC on at least one computer where the freq had nothing to do with the cpu clock, and was probably closer to the 14mhz mark. +1Ballerina
@matt: thanks. I'm eager to hear back from you. I've seen a few years old 8-way Xeon with the same problem. It was built before the then current revision of the ACPI spec finalized. Win2003 just couldn't decide what to use and fell back to the worst case (TSC). It was so much fun measuring the performance of our application on that... :-).Mabel
B
12

QueryPerformanceCounter / QueryPerformanceFrequency, processor speed resolution

Just be careful with multi-threaded code. Each core on a processor can have its own counter.

Some more information is in Obtaining Accurate Timestamps under Windows XP.

If you do end up having to resort to this method:

When I was trying to manually write data to a serial port (for an infrared transmitter), I found that setting the process and thread priority to maximum (real time) greatly improved its reliability (as in no errors). If I remember correctly, this had to have a resolution of around 40 kHz, so it should remain accurate enough for millisecond resolution.

Ballerina answered 10/3, 2010 at 3:41 Comment(9)
These can only give you an elapsed time since boot, and are not necessarily accurate if they are called on different threads - ie QueryPerformanceCounter may return different results on different CPUs. There are also some processors that adjust their frequency for power saving, which apparently isn't always reflected in their QueryPerformanceFrequency result.Letdown
@Kibibu Exactly. I hinted that I needed something better than QueryPerformanceCounter QueryPerformanceFrequency in the question but did not elaborate why. Thanks for pointing this out.Unaffected
@kibibu, if that's important then you have to make the code run on a particular processor. I forget what that's called....is it processor affinity?Gallstone
@kenny, yeah, processor affinity. This is the method I have used to do all my high precision stuff (such as profiling) anyway. Yeah, I should have read the MSDN link first; it suggests this as a solution and then explains how it's not perfect. @Nikhil: I'm curious as to why you need this? Theoretically all PCs have a system clock that maintains time when the processor is powered down. I don't know what the resolution of that would be, or how to get it.Ballerina
+1 for the correct reply. I know the correct reply is not always the wanted reply. QPC is the highest res timer available on a standard windows distro - unless you want to install custom hardware/drivers, the limits of QPC are what you need to work around.Counterpane
+1 - this is the way to do it with Windows. The counters will be accurate within the same threads, and if you need accuracy across threads you'll just have to use processor affinity. I know this is how most games calculate game frames.Morly
I've removed my downvote for this answer - I was sure there were better alternatives if you didn't need nanosecond accuracy, but milliseconds is as good as it gets for the various methods that return an actual time. Unfortunately I can't switch to an upvote now... Thread affinity is indeed the thing, just make sure you aren't calling this from across multiple threads without setting them all to the same processor. There are still (apparently) problems with some versions of SpeedStep and similar. As is usually the case, the right answer depends on the application.Letdown
@matt: it's been a number of years since CPUs have an HPET that is guaranteed to be consistent across any number of cores/threads. Now if Windows allows access to it is another matter, but they exist and they're not your grandpa's RTC/PIT. Thread as to how to read the HPET under Windows: #786824Residual
@WizardOfOdds that question is answered with 'use the RDTSC instruction'. I don't know what you are talking about, but I have used this instruction myself, and I can tell you it is just as bad if not worse than using QueryPerformanceCounter (which sometimes uses this instruction itself). This is somewhat confirmed by the fact that the answer had no upvotes while a comment under it saying DO NOT USE has 2.Ballerina
G
6
  1. Windows is not a real-time OS.

  2. Processes on a multitasking OS need to yield their time to other threads/processes. This adds some overhead to timing.

  3. Every function call has overhead, which adds a small delay before the result is returned.

  4. Furthermore, making a system call requires your process to switch from user mode to kernel mode, which has relatively high latency. You can overcome this by running the entire process in kernel mode (such as device driver code).

  5. Some OSes, like Linux or BSD, are better, but they still cannot maintain accurate timing resolution down to sub-microsecond (for example, the accuracy of nanosleep() on Linux is about 1 ms, not less than 1 ms), unless you patch the kernel with a specific scheduler that benefits your application.

So I think it's better to adapt your application to these issues, for example by recalibrating your timing routine often, which is what your links provide. AFAIK, the highest timer resolution for Windows is still QueryPerformanceCounter()/QueryPerformanceFrequency(), regardless of its accuracy. You can get better accuracy by running your timer polling routine in a separate thread, setting that thread's affinity to one processor core, and setting the thread priority as high as you can get.

Griddlecake answered 10/3, 2010 at 5:31 Comment(0)
H
6

I don't think you're going to find a solution better than QueryPerformanceCounter. The standard technique is to set up your code to catch and discard backward time jumps and massive outliers that might result from a thread switching CPUs. If you're measuring very small intervals (if not, then you don't need that precision), then it's not a common occurrence anyway. Just make it a tolerable error rather than a critical error.
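To make the "catch and discard" idea concrete, here is a small sketch. `IntervalFilter` and its threshold are my own invention rather than anything from an API; it takes raw counter readings and hands back only the intervals that look sane.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turns a stream of raw counter readings into elapsed
// intervals, treating backward jumps and implausibly large gaps as glitches.
class IntervalFilter {
public:
    explicit IntervalFilter(std::int64_t max_plausible_delta)
        : max_delta_(max_plausible_delta)
    {
        assert(max_plausible_delta > 0);
    }

    // Returns true and writes the interval if the reading is plausible.
    // Returns false for the very first reading (nothing to compare with)
    // and for rejected readings.
    bool accept(std::int64_t reading, std::int64_t& delta_out)
    {
        if (!has_last_) {
            last_ = reading;
            has_last_ = true;
            return false;
        }
        const std::int64_t delta = reading - last_;
        last_ = reading;  // always resynchronise to the newest reading
        if (delta < 0 || delta > max_delta_) {
            ++rejected_;  // tolerable error: count it and move on
            return false;
        }
        delta_out = delta;
        return true;
    }

    int rejected() const { return rejected_; }

private:
    std::int64_t max_delta_;
    std::int64_t last_ = 0;
    bool has_last_ = false;
    int rejected_ = 0;
};
```

The threshold is application-specific: pick something comfortably larger than the longest interval you expect to measure, and keep an eye on `rejected()` so a genuinely broken timer does not go unnoticed.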

In the rare cases where you absolutely need to be sure that it never happens, then locking your threads down by setting the processor affinity mask is the only option.
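A sketch of what locking a thread down might look like. `affinity_mask_for_cpu` and `pin_current_thread` are hypothetical helper names; the Windows call itself is SetThreadAffinityMask, and on other platforms this sketch simply reports failure. Pin once when the thread starts, not around every timer call.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Bit mask selecting a single logical processor (0-based index).
inline std::uint64_t affinity_mask_for_cpu(unsigned cpu_index)
{
    assert(cpu_index < 64);  // an affinity mask covers at most 64 processors
    return std::uint64_t{1} << cpu_index;
}

// Pins the calling thread to one logical processor.
// Returns true on success; always false on non-Windows builds.
inline bool pin_current_thread(unsigned cpu_index)
{
#ifdef _WIN32
    const DWORD_PTR mask = static_cast<DWORD_PTR>(affinity_mask_for_cpu(cpu_index));
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#else
    (void)cpu_index;
    return false;
#endif
}
```

For example, `pin_current_thread(0)` at the top of the timing thread keeps all of its counter reads on the first logical processor.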

Histogen answered 10/3, 2010 at 5:59 Comment(0)
P
6

QueryPerformanceCounter is the correct solution to this. Contrary to what you and some people answering you wrote, this call gives the correct answer even on multiprocessor systems (unless the system in question is broken), and it even handles changing CPU frequency. On most modern systems it is derived from RDTSC, but it handles all those multi-CPU and frequency-changing details for you. (It is significantly slower than RDTSC, though.)

See QueryPerformanceCounter

On a multiprocessor computer, it should not matter which processor is called. However, you can get different results on different processors due to bugs in the basic input/output system (BIOS) or the hardware abstraction layer (HAL).

Penelopa answered 13/3, 2010 at 21:24 Comment(0)
B
3

There's a lot of good information in the answers so far.

If what you're looking for is a straightforward way to get elapsed time since January 1, 1970 at millisecond or better resolution on Windows XP or later, there's a very simple cross-platform example of this in the CurrentTime.cpp of Apple's OSS release of JavaScriptCore for MacOS 10.7.5 (I can't seem to find it in their 10.8+ releases). The code I'm referring to is in the CurrentTime() function.

It uses the standard technique of using QueryPerformanceCounter() to calculate elapsed time differences at higher-than-millisecond resolution, and then periodically synchronizing it to the system clock to calculate a timestamp and account for clock drift. In order to get the higher resolution timestamps it requires that you are running Windows XP or later so that calls to QueryPerformanceFrequency() are guaranteed to succeed.

It doesn't account for context switches throwing things off slightly (as "Implement a Continuously Updating, High-Resolution Time Provider for Windows" and "The Windows Timestamp Project" do), but it does continually re-synchronize. I wouldn't launch a rocket with it, but at around 50 lines of code it's simple to implement and good enough for many purposes.
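The general shape of that technique can be sketched roughly as follows. This is not the JavaScriptCore code; `HybridClock` is a name I made up, and the two inputs (a coarse system time and a high-resolution counter reading, both already in microseconds) are passed in by the caller so the logic stays platform-neutral.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: wall-clock timestamps extrapolated from a
// high-resolution counter and re-anchored to the system clock periodically.
class HybridClock {
public:
    explicit HybridClock(std::int64_t resync_interval_us)
        : resync_interval_us_(resync_interval_us)
    {
        assert(resync_interval_us > 0);
    }

    // system_us:  coarse system time, microseconds since the Unix epoch
    // counter_us: high-resolution counter reading converted to microseconds
    std::int64_t now(std::int64_t system_us, std::int64_t counter_us)
    {
        const bool stale = !anchored_
            || counter_us < anchor_counter_us_                       // counter went backward
            || counter_us - anchor_counter_us_ >= resync_interval_us_;
        if (stale) {
            anchor_system_us_ = system_us;
            anchor_counter_us_ = counter_us;
            anchored_ = true;
        }
        // Anchor time plus the precisely measured time since the anchor.
        return anchor_system_us_ + (counter_us - anchor_counter_us_);
    }

private:
    std::int64_t resync_interval_us_;
    std::int64_t anchor_system_us_ = 0;
    std::int64_t anchor_counter_us_ = 0;
    bool anchored_ = false;
};
```

Note that a re-anchor can make the returned timestamp step slightly forward or backward, so this suits logging-style timestamps better than interval measurement; for intervals, use the counter directly.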

Also, if you know that you are guaranteed to be running Windows 8 / Windows Server 2012, you should just use GetSystemTimePreciseAsFileTime(), since it returns the system date and time at the highest possible precision (1 microsecond or better).
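A sketch of turning that call's result into a Unix timestamp in microseconds. The helper names are mine; the one fixed fact is that FILETIME counts 100 ns intervals since January 1, 1601, which is 116444736000000000 such intervals before the Unix epoch. The Windows part assumes a build that targets Windows 8 or later.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// 100 ns intervals between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
constexpr std::uint64_t kFiletimeUnixEpochOffset = 116444736000000000ULL;

// Converts a FILETIME value (as one 64-bit integer) to microseconds since
// the Unix epoch. Ten 100 ns units make one microsecond.
inline std::int64_t filetime_to_unix_micros(std::uint64_t filetime_100ns)
{
    assert(filetime_100ns >= kFiletimeUnixEpochOffset);
    return static_cast<std::int64_t>((filetime_100ns - kFiletimeUnixEpochOffset) / 10);
}

#ifdef _WIN32
// Hypothetical helper; requires Windows 8 / Windows Server 2012 or later.
inline std::int64_t precise_unix_micros()
{
    FILETIME ft;
    GetSystemTimePreciseAsFileTime(&ft);
    ULARGE_INTEGER t;
    t.LowPart = ft.dwLowDateTime;
    t.HighPart = ft.dwHighDateTime;
    return filetime_to_unix_micros(t.QuadPart);
}
#endif
```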

Brogdon answered 23/5, 2014 at 15:3 Comment(1)
Also, starting with Visual Studio 2015, std::chrono on Windows will have sub-millisecond resolution.Brogdon
K
1

I've used the DateTimePrecise class from The Code Project.

The only problem I had with it is that it would give crazy results if I didn't call it at least every 10 seconds -- I think there was some sort of integer overflow internally -- so I have a timer which executes DateTimePrecise.Now every few seconds.

You should also run NTP on the machine if you want the times to be at all accurate.

Good luck...

Keelson answered 10/3, 2010 at 23:18 Comment(0)
P
1

I discovered difficulties using PerformanceCounter together with PerformanceCounterFrequency, because the given PerformanceCounterFrequency deviates from the actual frequency.

It deviates by an offset, and it also shows thermal drift. Newer hardware seems to have less drift, but the drift and the offset are quite considerable. A drift of a few ppm will already damage the microsecond accuracy to a large extent, since 1 ppm is 1 µs/s! Therefore a careful hardware-specific calibration is strongly recommended when using PerformanceCounter with PerformanceCounterFrequency. This may also be the reason why "crazy results" are observed when not calling certain functions frequently.
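To put numbers on that, here is the arithmetic as a small sketch. The function names are made up for illustration, and the calibration assumes you have some trusted reference interval to compare against (how to obtain one is the hard, hardware-specific part).

```cpp
#include <cassert>

// Accumulated timing error in microseconds for a frequency error given in
// parts per million over an interval in seconds (1 ppm == 1 µs per second).
inline double drift_microseconds(double error_ppm, double elapsed_seconds)
{
    assert(elapsed_seconds >= 0.0);
    return error_ppm * elapsed_seconds;
}

// Frequency actually observed when `counter_ticks` elapsed while a trusted
// reference measured `reference_seconds`.
inline double calibrated_frequency_hz(double counter_ticks, double reference_seconds)
{
    assert(reference_seconds > 0.0);
    return counter_ticks / reference_seconds;
}

// Offset of the reported frequency from the calibrated one, in ppm.
inline double frequency_error_ppm(double reported_hz, double calibrated_hz)
{
    assert(calibrated_hz > 0.0);
    return (reported_hz - calibrated_hz) / calibrated_hz * 1e6;
}
```

For instance, a 5 ppm error accumulates to 300 µs after one minute, which is why an uncorrected counter cannot hold microsecond accuracy for long.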

I did some more detailed investigations on this matter. A description can be found in Microsecond Resolution Time Services for Windows.

Perplex answered 7/7, 2012 at 9:42 Comment(0)
A
0

Since C++11 there's a new <chrono> header, so this task is much simpler. Just use std::chrono::high_resolution_clock, std::chrono::system_clock (wall clock), or std::chrono::steady_clock (monotonic clock) and you'll have a cross-platform solution:

#include <chrono>
#include <iostream>

int main()
{
    auto start1 = std::chrono::high_resolution_clock::now();
    auto start2 = std::chrono::system_clock::now();
    auto start3 = std::chrono::steady_clock::now();
    // do some work
    auto end1 = std::chrono::high_resolution_clock::now();
    auto end2 = std::chrono::system_clock::now();
    auto end3 = std::chrono::steady_clock::now();

    // Integer durations need duration_cast when converting to a coarser unit
    // (e.g. nanosecond ticks to microseconds); floating-point ones convert implicitly.
    auto diff1 = std::chrono::duration_cast<std::chrono::microseconds>(end1 - start1);
    std::chrono::duration<double, std::nano> diff2 = end2 - start2;
    auto diff3 = std::chrono::duration_cast<std::chrono::microseconds>(end3 - start3);

    std::cout << diff1.count() << ' ' << diff2.count() << ' ' << diff3.count() << '\n';
}

In C++17 and above (or C11 and above) there's another solution: std::timespec_get()

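#include <ctime>
#include <iomanip>
#include <iostream>
 
int main()
{
    std::timespec ts;
    std::timespec_get(&ts, TIME_UTC);
    char buf[100];
    std::strftime(buf, sizeof buf, "%D %T", std::gmtime(&ts.tv_sec));
    // Zero-pad the nanoseconds to nine digits so that 5 ms prints as .005000000, not .5000000
    std::cout << "Current time: " << buf << '.'
              << std::setfill('0') << std::setw(9) << ts.tv_nsec << " UTC\n";
}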
Artieartifact answered 10/4, 2022 at 1:19 Comment(0)
