Getting an accurate execution time in C++ (micro seconds)

Asked 18/2, 2014 at 13:57 Answered 2/3, 2018 at 9:35

Solved c++ performance benchmarking timing microbenchmark

I want to measure the execution time of my C++ program accurately, in microseconds. I have tried clock_t, but it's not accurate enough.

(Note that micro-benchmarking is hard. An accurate timer is only a small part of what's necessary to get meaningful results for short timed regions. See Idiomatic way of performance evaluation? for some more general caveats)

Voight answered 18/2, 2014 at 13:57 Comment(4)

Why do you think it's not accurate? – Eurystheus 18/2, 2014 at 13:59

Given that the execution time depends on the CPU load, the availability of memory, the cache, all possible I/O, the thread scheduler, etc., are you sure you need that level of accuracy? – Discontented 18/2, 2014 at 14:2

@Voight If one of the answers here solved your problem please mark it as accepted. This way your question will stop showing up in the unanswered section. – Methedrine 18/2, 2014 at 18:41

I need the execution time in microseconds, and clock_t only gives me milliseconds... – Voight 24/2, 2014 at 13:42
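One possible reason clock_t readings look wrong: std::clock() reports the processor time used by the process, in ticks of CLOCKS_PER_SEC, rather than elapsed wall-clock time, so time spent sleeping or blocked is usually not counted (some implementations differ on this). The sketch below measures the same interval both ways. Timings and compare_clocks are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical helper type: both measurements, in microseconds.
struct Timings {
    double cpu_us;   // processor time reported by std::clock()
    double wall_us;  // elapsed time reported by std::chrono::steady_clock
};

// Sleeps for `pause` and measures that interval with both clocks.
Timings compare_clocks(std::chrono::milliseconds pause)
{
    assert(pause.count() >= 0);

    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    std::this_thread::sleep_for(pause);  // blocks, using almost no CPU time

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timings t;
    t.cpu_us = 1e6 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.wall_us =
        std::chrono::duration<double, std::micro>(wall_end - wall_start).count();
    return t;
}
```

Calling compare_clocks(std::chrono::milliseconds(20)) should give a wall_us of at least roughly 20000, while cpu_us is typically far smaller on systems where std::clock() counts CPU time only.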

107

If you are using C++11 or later you could use std::chrono::high_resolution_clock.

A simple use case :

#include <chrono>

auto start = std::chrono::high_resolution_clock::now();
// ... code to be timed ...
auto elapsed = std::chrono::high_resolution_clock::now() - start;

long long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(
        elapsed).count();

This solution has the advantage of being portable.

Beware that micro-benchmarking is hard. It's very easy to measure the wrong thing (like your benchmark being optimized away), to include page faults in your timed region, or to fail to account for CPU frequency scaling (idle vs. turbo).

See Idiomatic way of performance evaluation? for some general tips, e.g. sanity check by testing the other one first and see if that changes which one appears faster.
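To make those caveats concrete, here is a sketch, not a rigorous benchmark harness. It runs the workload once untimed as a warm-up, reads the input from a volatile variable and writes the result to another one so the computation stays inside the timed region, and reports the fastest of several runs. The names work, input, sink, and best_time_us are made up for this example, and std::chrono::steady_clock is used because it is guaranteed to be monotonic.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical workload: sums the integers in [0, n).
std::uint64_t work(std::uint64_t n)
{
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) sum += i;
    return sum;
}

// Volatile accesses are observable side effects: the read of `input` and
// the write to `sink` must happen between the two clock readings, and the
// call to work() depends on the first and feeds the second.
volatile std::uint64_t input;
volatile std::uint64_t sink;

// Returns the fastest of `runs` timed executions, in microseconds.
long long best_time_us(std::uint64_t n, int runs)
{
    assert(runs > 0);
    using clock = std::chrono::steady_clock;

    input = n;
    sink = work(input);  // untimed warm-up run

    long long best = std::numeric_limits<long long>::max();
    for (int r = 0; r < runs; ++r) {
        const auto start = clock::now();
        sink = work(input);
        const auto elapsed = clock::now() - start;
        best = std::min<long long>(
            best,
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
    }
    return best;
}
```

An optimizer may still simplify the loop inside work() itself (for instance into a closed-form expression), so it is worth inspecting the generated code before trusting very small numbers.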

Methedrine answered 18/2, 2014 at 14:9 Comment(11)

Note: In g++ nanoseconds are available since GCC 4.8.1. – Anabolite 18/2, 2014 at 14:29

-1: this is good to time a function, but not to time a program, since you are missing the set-up and destruction part of the program by the OS. – Discontented 18/2, 2014 at 14:58

@Discontented to test the performance of a program on the scale of microseconds, the set-up and destruction part should be ignored. – Unpremeditated 18/2, 2014 at 15:22

@stefan: what makes you say so? reading the executable from disk to memory can be expensive. In addition, different OS have different set-up systems, can you confirm that all of them load the program in less than a microsecond?. It might all be irrelevant in the end, but I don’t think we should ignore them. – Discontented 18/2, 2014 at 15:55

@Discontented Exactly, different OS have different set-up times. That's why. You're not measuring the apps performance, but the OS performance, if you include that bit. Why would you do that? Unless you're an OS developer, you can't do anything about the set-up time. Even on one OS, the hard disk may need to spin up in order to read the executable. Do you really want to measure that to a microsecond precision? – Unpremeditated 18/2, 2014 at 15:58

@Unpremeditated hm, for instance, I could want to do that because I am a game developer and I want to know how long it takes for my game to start specifically on Windows 8, for instance. As a developer, I have an impact on how large the executable is: I can embed resources in the binary executable's sections, strip symbols, optimize for space, etc. I think the question needs to be clarified so as what exactly is to be measured. – Discontented 18/2, 2014 at 16:3

@Discontented Right, that is an application. But would you really do this by measuring just the time? You surely need a profiler to identify what part causes the high set-up time. The start-up of any program should "feel instantaneous". You don't really need to measure this imho. – Unpremeditated 18/2, 2014 at 16:21

btw, if I can't use auto, which data type should I use for std::chrono::high_resolution_clock::now()? – Henkel 25/8, 2014 at 3:33

I see. Answer myself. Should be: high_resolution_clock::time_point (example here: cplusplus.com/reference/chrono/high_resolution_clock/now ) - or std::chrono::high_resolution_clock::time_point in case of not using the namespace. – Henkel 25/8, 2014 at 3:37

Using eclipse in windows and I get "Function 'now' could not be resolved' on that first line: auto start = std::chrono::high_resolution_clock::now(); However, it works fine in linux. What's the deal? – Exodontist 1/11, 2016 at 23:51

It's basically always wrong to use std::chrono::high_resolution_clock. In practice it is an alias for either std::chrono::steady_clock or std::chrono::system_clock, and they have very different characteristics, so you should make a conscious decision of which one is appropriate for you (almost certainly std::chrono::steady_clock to time how long something takes) and stick with that. – Darn 17/7, 2020 at 11:22
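A small sketch of what the last comment suggests: check whether high_resolution_clock is steady on your implementation, and time with steady_clock directly. It also spells out the time_point type for cases where auto is not available. The names hrc_is_steady and time_us are hypothetical.

```cpp
#include <cassert>
#include <chrono>

// True if this implementation's high_resolution_clock is monotonic.
constexpr bool hrc_is_steady = std::chrono::high_resolution_clock::is_steady;

// The standard requires steady_clock::is_steady to be true.
static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock must be steady");

// Times a callable with steady_clock and returns elapsed microseconds.
template <typename F>
long long time_us(F&& f)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    f();
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    assert(stop >= start);  // a steady clock never goes backwards
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

Usage would look like time_us([] { /* code to be timed */ }); the value of hrc_is_steady tells you which kind of clock high_resolution_clock maps to on the platform at hand.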

Here is how to get simple C-like millisecond, microsecond, and nanosecond timestamps in C++:

The new C++11 std::chrono library is one of the most complicated piles of ~~mess~~ C++ I have ever seen or tried to figure out how to use, but at least it is cross-platform!

So, if you'd like to simplify it down and make it more "C-like", including removing all of the type-safe class stuff it does, here are 3 simple and very easy-to-use functions to get timestamps in milliseconds, microseconds, and nanoseconds...that only took me about 12 hrs to write*:

NB: In the code below, you might consider using std::chrono::steady_clock instead of std::chrono::high_resolution_clock. Their definitions from here (https://en.cppreference.com/w/cpp/chrono) are as follows:

steady_clock (C++11) - monotonic clock that will never be adjusted

high_resolution_clock (C++11) - the clock with the shortest tick period available

#include <chrono>
#include <cstdint> // for uint64_t

// NB: ALL OF THESE 3 FUNCTIONS BELOW USE SIGNED VALUES INTERNALLY AND WILL
// EVENTUALLY OVERFLOW (AFTER 200+ YEARS OR SO), AFTER WHICH POINT THEY WILL
// HAVE *SIGNED OVERFLOW*, WHICH IS UNDEFINED BEHAVIOR (IE: A BUG) FOR C/C++.
// But...that's ok...this "bug" is designed into the C++11 specification, so
// whatever. Your machine won't run for 200 years anyway...

// Get time stamp in milliseconds.
uint64_t millis()
{
    uint64_t ms = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch())
            .count();
    return ms; 
}

// Get time stamp in microseconds.
uint64_t micros()
{
    uint64_t us = std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch())
            .count();
    return us; 
}

// Get time stamp in nanoseconds.
uint64_t nanos()
{
    uint64_t ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::high_resolution_clock::now().time_since_epoch())
            .count();
    return ns; 
}

* (Sorry, I've been more of an embedded developer than a standard computer programmer so all this high-level, abstracted static-member-within-class-within-namespace-within-namespace-within-namespace stuff confuses me. Don't worry, I'll get better.)
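Following the NB above, here is a sketch of the same idea built on std::chrono::steady_clock. The epoch of steady_clock is unspecified (often boot time), so these values are only meaningful as differences, not as calendar timestamps. micros_steady and elapsed_us are hypothetical names, and the code assumes the tick count since the epoch is non-negative, which holds on common implementations.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Monotonic timestamp in microseconds. Only the difference between two
// readings is meaningful; the epoch of steady_clock is unspecified.
std::uint64_t micros_steady()
{
    const auto since_epoch =
        std::chrono::steady_clock::now().time_since_epoch();
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::microseconds>(since_epoch)
            .count());
}

// Microseconds elapsed since an earlier micros_steady() reading.
std::uint64_t elapsed_us(std::uint64_t start_us)
{
    const std::uint64_t now_us = micros_steady();
    assert(now_us >= start_us);  // steady_clock never goes backwards
    return now_us - start_us;
}
```

Typical use: take uint64_t t0 = micros_steady(); before the code to be timed and call elapsed_us(t0) after it.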

Q: Why `std::chrono`?

A: Because C++ programmers like to go crazy with things, so they made it handle units for you. Here are a few cases of some C++ weirdness and uses of std::chrono. Reference this cppreference community wiki page: https://en.cppreference.com/w/cpp/chrono/duration.

So you can declare a variable of 1 second and change it to microseconds with no cast like this:

// Create a time object of type `std::chrono::seconds` & initialize it to 1 sec
std::chrono::seconds time_sec(1); 
// integer scale conversion with no precision loss: no cast
std::cout << std::chrono::microseconds(time_sec).count() << " microseconds\n";
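The reverse direction is where the type system earns its keep. As a sketch (the function names are hypothetical): converting to a finer unit is exact and therefore implicit, while converting to a coarser unit can lose precision and needs an explicit std::chrono::duration_cast, which truncates toward zero.

```cpp
#include <cassert>
#include <chrono>

// Widening (seconds -> microseconds) is exact, so it is implicit.
long long to_micros(std::chrono::seconds s)
{
    const std::chrono::microseconds us = s;  // no cast required
    return us.count();
}

// Narrowing (microseconds -> milliseconds) can lose precision, so it
// needs an explicit duration_cast, which truncates toward zero.
long long to_whole_millis(std::chrono::microseconds us)
{
    // std::chrono::milliseconds ms = us;  // would not compile
    return std::chrono::duration_cast<std::chrono::milliseconds>(us).count();
}

// Usage, with the expected values stated as assertions.
void conversion_demo()
{
    assert(to_micros(std::chrono::seconds(1)) == 1000000);
    assert(to_whole_millis(std::chrono::microseconds(1999)) == 1);
}
```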

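And you can even specify time like this, which is super weird and going way overboard in my opinion. C++14 added the literal suffixes h, min, s, ms, us, and ns as user-defined literal operators (operator""s, operator""ms, and so on, in namespace std::chrono_literals) to initialize std::chrono objects of various types. They are only visible after a using declaration such as using namespace std::chrono_literals; and then work like this: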

auto time_sec = 1s; // <== notice the 's' inside the code there 
                    // to specify 's'econds!
// OR:
std::chrono::seconds time_sec = 1s;
// integer scale conversion with no precision loss: no cast
std::cout << std::chrono::microseconds(time_sec).count() << " microseconds\n";

Here are some more examples:

std::chrono::milliseconds time_ms = 1ms;
// OR:
auto time_ms = 1ms;

std::chrono::microseconds time_us = 1us;
// OR:
auto time_us = 1us;

std::chrono::nanoseconds time_ns = 1ns;
// OR:
auto time_ns = 1ns;
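The snippets above only compile once the literal suffixes are brought into scope. A minimal sketch (C++14 or later; total_micros is a hypothetical name) that also shows durations of different units being added and compared:

```cpp
#include <cassert>
#include <chrono>

// The suffixes live in an inline namespace and are not visible by default.
using namespace std::chrono_literals;  // C++14

// Adds mixed-unit durations; the sum is expressed in the finest unit
// involved, here microseconds.
long long total_micros()
{
    assert(1000ms == 1s);  // durations compare by value across units
    const auto total = 1s + 250ms + 10us;
    return total.count();
}
```

Here total_micros() returns 1250010, i.e. 1 s + 250 ms + 10 us expressed in microseconds.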

Personally, I'd much rather just simplify the language and do this, like I already do, and as has been done in both C and C++ prior to this for decades:

// Notice the `_sec` at the end of the variable name to remind me this 
// variable has units of *seconds*!
uint64_t time_sec = 1;

And here are a few references:

Clock types (https://en.cppreference.com/w/cpp/chrono):
1. system_clock
2. steady_clock
3. high_resolution_clock
4. utc_clock
5. tai_clock
6. gps_clock
7. file_clock
8. etc.
Getting an accurate execution time in C++ (micro seconds) (answer by @OlivierLi)
http://en.cppreference.com/w/cpp/chrono/time_point/time_since_epoch
http://en.cppreference.com/w/cpp/chrono/duration - shows types such as hours, minutes, seconds, milliseconds, etc
http://en.cppreference.com/w/cpp/chrono/system_clock/now

Video I need to watch still:

CppCon 2016: Howard Hinnant "A <chrono> Tutorial"

My 3 sets of timestamp functions (cross-linked to each other):
1. For C timestamps, see my answer here: Get a timestamp in C in microseconds?
2. For C++ high-resolution timestamps, see my answer here: Getting an accurate execution time in C++ (micro seconds)
3. For Python high-resolution timestamps, see my answer here: How can I get millisecond and microsecond-resolution timestamps in Python?
[my answer] Using operator ""s for std::chrono with gcc--Don't forget a using namespace declaration to get access to std::chrono::duration<> literals

ADDENDUM

More on "User-defined literals" (since C++11):

The operator"" mysuffix() operator overload/user-defined-literal/suffix function (as of C++11) is how the strange auto time_ms = 1ms; thing works above. Writing 1ms is actually a function call to function operator"" ms(), with a 1 passed in as the input parameter, as though you had written a function call like this: operator"" ms(1). To learn more about this concept, see the reference page here: cppreference.com: User-defined literals (since C++11).

Here is a basic demo to define a user-defined-literal/suffix function, and use it:

// 1. Define a function
// used as conversion from degrees (input param) to radians (returned output)
constexpr long double operator"" _deg(long double deg)
{
    long double radians = deg * 3.14159265358979323846264L / 180;
    return radians;
}

// 2. Use it
double x_rad = 90.0_deg;

Why not just use something more like double x_rad = degToRad(90.0); instead (as has been done in C and C++ for decades)? I don't know. It has something to do with the way C++ programmers think, I guess. Maybe they're trying to make modern C++ more Pythonic.
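One practical wrinkle with the _deg demo above: a literal operator taking long double only matches floating-point literals, so 90_deg (no decimal point) would not compile without a second overload taking unsigned long long. The sketch below adds that overload and puts the plain-function alternative next to it for comparison; deg_demo is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Floating-point literals such as 90.0_deg select this overload.
constexpr long double operator""_deg(long double deg)
{
    return deg * 3.14159265358979323846264L / 180;
}

// Integer literals such as 90_deg need their own overload; without it,
// 90_deg would fail to compile.
constexpr long double operator""_deg(unsigned long long deg)
{
    return static_cast<long double>(deg) * 3.14159265358979323846264L / 180;
}

// Equivalent plain function, for comparison.
constexpr long double degToRad(long double deg)
{
    return deg * 3.14159265358979323846264L / 180;
}

// Usage: all three spellings give pi/2 (within rounding).
void deg_demo()
{
    const long double expected = 1.5707963267948966L;
    assert(std::fabs(90_deg - expected) < 1e-12L);
    assert(std::fabs(90.0_deg - expected) < 1e-12L);
    assert(std::fabs(degToRad(90.0L) - expected) < 1e-12L);
}
```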

This magic is also how the potentially very useful C++ fmt library works, here: https://github.com/fmtlib/fmt. It is written by Victor Zverovich, also the author of C++20's std::format. You can see the definition for the function detail::udl_formatter<char> operator"" _format(const char* s, size_t n) here. Its usage is like this:

"Hello {}"_format("World");

Output:

Hello World

This inserts the "World" string into the first string where {} is located. Here is another example:

"I have {} eggs and {} chickens."_format(num_eggs, num_chickens);

Sample output:

I have 29 eggs and 42 chickens.

This is very similar to the str.format string formatting in Python. Read the fmt lib documentation here.

Steels answered 2/3, 2018 at 9:35 Comment(7)

Years later, and even after using C++ professionally for many months now, I'm still absolutely mind-boggled by the unnecessary complexity of simple things like seconds, milliseconds, microseconds, and nanoseconds (among many other things), introduced by C++. – Steels 13/9, 2020 at 4:39

You can program using language idioms of X by using language Y, but it is probably going to be a mess. Substitute any two languages for X and Y and this is going to be true. In this case C and C++. Here's a good example of how to use <chrono> correctly: https://mcmap.net/q/21035/-gaffer-on-games-timestep-std-chrono-implementation The biggest characteristic that makes this a good example is not continually escaping the chrono type system into integral types. – Acidulous 13/9, 2020 at 13:33

Thanks for personally responding, Howard, I'll take a look. I'm studying many complicated aspects of C++ right now. I still have much to learn. However, I'd argue that a C developer could write better and cleaner "C" (let's call it "C+") using the C++ compiler than he/she could ever write with the C compiler. C++ has many wonderful additions, I just find most C++ developer's usage of it too complex, especially for embedded systems. I'd like to see the C++ compiler used a LOT more for embedded systems, but not with every modern C++ language feature and idiom. – Steels 13/9, 2020 at 15:47

At every turn, I'm finding C++ syntax distracting me from quickly and efficiently accomplishing the algorithmic and problem-solving task at hand. – Steels 13/9, 2020 at 15:51

Note that I've been using C++ professionally for ~45 to 65 hrs per week for 8 months now, with probably 12 to 30 hrs per week of that being studying the C++ language and syntax in order to understand, modify, or add to the code at hand. With C, I found myself having the time to study the embedded hardware, which is incredibly important to understand for an embedded engineer, & C architecture, far more, and having to study the C language and syntax far less. From that perspective, C++ has been a major time sink and distraction from just writing my code. Nevertheless, I will persevere. – Steels 13/9, 2020 at 16:7

Fwiw, I used to feel about git exactly as you do now about <chrono>. But after a friend helped me through the first hurdles, and quite frankly the addition of github.com, I now wouldn't use any other source control system. I can empathize with your position. – Acidulous 13/9, 2020 at 16:22

Thanks for this amazing answer. Here is a question: Why did you use high_resolution_clock? Is it always the most precise? It seems like people say it's not, such as this answer. I guess @HowardHinnant doesn't seem to recommend high_resolution_clock, either. So...what's the best way to measure time with the most precision in milliseconds or microseconds? – Lessor 23/7, 2022 at 14:58

If you want to know how much time your whole program takes when run from a Unix shell, use the time command, as below:

time ./a.out 

real    0m0.001s
user    0m0.000s
sys     0m0.000s

Secondly, if you want the time taken to execute a number of statements within the program code (C), try gettimeofday(), as below:

#include <stdio.h>
#include <sys/time.h>
struct timeval  tv1, tv2;
gettimeofday(&tv1, NULL);
/* Program code to execute here */
gettimeofday(&tv2, NULL);
printf("Time taken in execution = %f seconds\n",
     (double) (tv2.tv_usec - tv1.tv_usec) / 1000000 +
     (double) (tv2.tv_sec - tv1.tv_sec));

Zenobiazeolite answered 18/2, 2014 at 14:3 Comment(1)

Use clock_gettime() instead. The man pages say: "The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2)." The Open Group says: "Applications should use the clock_gettime() function instead of the obsolescent gettimeofday() function." – Vanderhoek 27/5, 2016 at 5:9
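A sketch of what the comment above recommends, written so that it compiles as C++ on a POSIX system: clock_gettime() with CLOCK_MONOTONIC, which is not affected by changes to the system time. monotonic_us and time_busy_loop_us are hypothetical names, and the busy loop only stands in for the code to be timed.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>  // POSIX: clock_gettime, CLOCK_MONOTONIC

// Monotonic timestamp in microseconds (POSIX only).
// Returns -1 if clock_gettime() fails.
std::int64_t monotonic_us()
{
    timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1;
    }
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
}

// Example: microseconds spent in a short busy loop.
std::int64_t time_busy_loop_us(int iterations)
{
    assert(iterations >= 0);
    volatile int acc = 0;  // volatile keeps the loop from being optimized out
    const std::int64_t start = monotonic_us();
    for (int i = 0; i < iterations; ++i) acc = acc + 1;
    const std::int64_t stop = monotonic_us();
    return stop - start;
}
```

On older glibc versions, linking may additionally require -lrt.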

If you're on Windows, you can use QueryPerformanceCounter

See How to use the QueryPerformanceCounter function to time code in Visual C++

__int64 ctr1 = 0, ctr2 = 0, freq = 0;
int acc = 0, i = 0;

// Start timing the code.
if (QueryPerformanceCounter((LARGE_INTEGER *)&ctr1)!= 0)
{
    // Code segment is being timed.
    for (i=0; i<100; i++) acc++;

    // Finish timing the code.
    QueryPerformanceCounter((LARGE_INTEGER *)&ctr2);

    Console::WriteLine("Start Value: {0}",ctr1.ToString());
    Console::WriteLine("End Value: {0}",ctr2.ToString());

    QueryPerformanceFrequency((LARGE_INTEGER *)&freq);

    Console::WriteLine(S"QueryPerformanceCounter minimum resolution: 1/{0} Seconds.",freq.ToString());
    // In Visual Studio 2005, this line should be changed to:     Console::WriteLine("QueryPerformanceCounter minimum resolution: 1/{0} Seconds.",freq.ToString()); 
    Console::WriteLine("100 Increment time: {0} seconds.",((ctr2 - ctr1) * 1.0 / freq).ToString());
}
else
{
    DWORD dwError = GetLastError();
    Console::WriteLine(S"Error value = {0}",dwError.ToString());// In Visual Studio 2005, this line should be changed to: Console::WriteLine("Error value = {0}",dwError.ToString());
}

// Make the console window wait.
Console::WriteLine();
Console::Write("Press ENTER to finish.");
Console::Read();

return 0;

You can put it around a call to CreateProcess(...) and WaitForSingleObject(...) for the entire process lifetime, otherwise around the main function for your code.

Leucite answered 18/2, 2014 at 14:9 Comment(5)

But he wants the execution time of the program, not a function. This includes the set-up part that you can’t typically access. – Discontented 18/2, 2014 at 14:51

@qdii: He can put it around a call to CreateProcess(...) and WaitForSingleObject(...) for the entire process lifetime, otherwise around his main function. It's a reasonable solution, I don't think it really deserves downvotes though... – Leucite 18/2, 2014 at 15:22

+1, although being platform dependent. Why? Easy: on several compilers for this platform, either std::chrono::high_resolution_clock is not available or is set to a not-as-high-as-possible timer. This specific solution is necessary. – Unpremeditated 18/2, 2014 at 15:25

@Leucite good point, if you change your answer to include that I will turn my downvote into an upvote – Discontented 18/2, 2014 at 15:56

Boost.Chrono has a decent high resolution clock on Windows. – Farnese 18/2, 2014 at 22:54
