C++11 clocks: g++ steady_clock::is_steady == false?

Accurate timing is important to me, so I was investigating the three clock types specified in C++11: system_clock, steady_clock, and high_resolution_clock. My first goal was to test whether the call overhead differs between the clocks, and to check the resolution of each one. Here is my sample program:

#include <chrono>
#include <cstdio>
using namespace std;
using namespace std::chrono;

int main(int argc, char **argv)
{
  size_t N = 1e6;
  if(2 == argc) {
    sscanf(argv[1], "%zu", &N);
  }

#if defined(hrc)
  typedef high_resolution_clock clock;
#warning "High resolution clock"
#elif defined(sc)
  typedef steady_clock clock;
#warning "Steady clock"
#elif defined(sys)
  typedef system_clock clock;
#warning "System clock"
#else
#error "Define exactly one of hrc, sc, or sys (e.g. -Dhrc)"
#endif

  const double resolution = double(clock::period::num) / double(clock::period::den);

  printf("clock::period: %lf us.\n", resolution*1e6);
  printf("clock::is_steady: %s\n", clock::is_steady ? "yes" : "no");
  printf("Calling clock::now() %zu times...\n", N);

  // first, warm up
  for(size_t i=0; i<100; ++i) {
    time_point<clock> t = clock::now();
  }

  // loop N times
  time_point<clock> start = clock::now();
  for(size_t i=0; i<N; ++i) {
    time_point<clock> t = clock::now();
  }
  time_point<clock> end = clock::now();

  // display duration
  duration<double> time_span = duration_cast<duration<double>>(end-start);
  const double sec = time_span.count();
  const double ns_it = sec*1e9/N;
  printf("That took %lf seconds. That's %lf ns/iteration.\n", sec, ns_it);

  return 0;
}

I compile it with

$ g++-4.7 -std=c++11 -Dhrc chrono.cpp -o hrc_chrono
chrono.cpp:15:2: warning: #warning "High resolution clock" [-Wcpp]
$ g++-4.7 -std=c++11 -Dsys chrono.cpp -o sys_chrono
chrono.cpp:15:2: warning: #warning "System clock" [-Wcpp]
$ g++-4.7 -std=c++11 -Dsc  chrono.cpp -o sc_chrono
chrono.cpp:15:2: warning: #warning "Steady clock" [-Wcpp]

I compiled with G++ 4.7.2, and ran it on

  • SUSE Linux, kernel v3.1.10, CPU i7
  • Angstrom Linux embedded system, kernel v3.1.10, MCU Tegra 2 (ARM Cortex A9).

The first surprise was that the three clock types are apparently synonyms. They all have the same period (1 microsecond), and the time per call is practically the same. What's the point of specifying three clock types if they are all the same? Is this just because the G++ implementation of chrono isn't mature yet? Or maybe the 3.1.10 kernel only has one user-accessible clock?

The second surprise, and this is huge, is that steady_clock::is_steady == false. I'm fairly certain that, by definition, that property should be true. What gives? How can I work around it (i.e., get a steady clock)?

If you can run the simple program on other platforms/compilers, I would be very interested to know the results. If anybody is wondering, it's about 25 ns/iteration on my Core i7, and 1000 ns/iteration on the Tegra 2.

Wrac answered 22/2, 2013 at 20:13 Comment(2)
Umm, yeah. I'm just compiling the code 3 times, once for each type of clock, which is specified by the -DXXX flag. The last argument to g++ is just the executable's filename, which doesn't matter a lick (although I have it reflect both which type of clock and the fact that the program is exercising the chrono library). - Wrac
Sorry, I completely misread the command line. - Janeenjanek
A
10

steady_clock is supported in GCC 4.7, as shown by the docs for the 4.7 release: http://gcc.gnu.org/onlinedocs/gcc-4.7.2/libstdc++/manual/manual/status.html#status.iso.2011. However, steady_clock::is_steady is true only if you build GCC with --enable-libstdcxx-time=rt.

See https://mcmap.net/q/364812/-what-is-_glibcxx_use_nanosleep-all-about for details of that configuration option.

For GCC 4.9 it will be enabled automatically if your OS and C library support POSIX monotonic clocks for clock_gettime (which is true for GNU/Linux with glibc 2.17 or later and for Solaris 10, IIRC).

Here are the results with GCC 4.8 configured with --enable-libstdcxx-time=rt, on an AMD Phenom II X4 905e (2.5GHz, but I think it's throttled to 800MHz right now) running Linux 3.6.11 with glibc 2.15:

$ ./hrc
clock::period: 0.001000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.069646 seconds. That's 69.645928 ns/iteration.
$ ./sys
clock::period: 0.001000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.062535 seconds. That's 62.534986 ns/iteration.
$ ./sc
clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.065684 seconds. That's 65.683730 ns/iteration.

And with GCC 4.7 without --enable-libstdcxx-time (so the results are the same for all three clock types), on an ARMv7 Exynos5 running Linux 3.4.0 with glibc 2.16:

clock::period: 1.000000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 1.089904 seconds. That's 1089.904000 ns/iteration.
Amphibolous answered 22/2, 2013 at 21:58 Comment(2)
I see in an archived email of yours (gcc.gnu.org/ml/libstdc++/2012-05/msg00085.html) that "To get maximum clock resolution on GNU/Linux it's still necessary to use --enable-libstdcxx-time=rt, causing a performance hit in single-threaded code that uses libstdc++." Can you specify what you mean (i.e., what operations will have a performance hit?) and how you arrived at the conclusion (i.e., did you profile?)? - Wrac
See the first paragraph of that mail: the reason is that some or all of those calls are defined in librt, but on GNU/Linux if libstdc++.so links to librt.so then it also links to libpthread.so, and so __gthread_active_p() will always return true, causing additional locking in single-threaded apps. The reference counting in libstdc++ will use atomic ops or mutexes in programs that use multiple threads, as determined by whether the program links to libpthread or not. - Amphibolous
C
7

If you can run the simple program on other platforms/compilers, I would be very interested to know the results.

Mac OS X 10.8, clang++ / libc++, -O3, 2.8 GHz Core i5:

High resolution clock

clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.021833 seconds. That's 21.832827 ns/iteration.

System clock

clock::period: 1.000000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.041930 seconds. That's 41.930000 ns/iteration.

Steady clock

clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.021478 seconds. That's 21.477953 ns/iteration.

steady_clock and system_clock are required to be distinct types. steady_clock::is_steady is required to be true. high_resolution_clock may be a distinct type or an alias of steady_clock or system_clock. system_clock::rep must be a signed type.

Carry answered 22/2, 2013 at 20:23 Comment(0)
M
4

According to GNU's site, GNU libstdc++ doesn't support steady_clock yet. That's why steady_clock::is_steady is false.

Here is the relevant section of the support checklist:

20.11.7.1   Class system_clock           Y   
20.11.7.2   Class steady_clock           N   Support old monotonic_clock spec instead
20.11.7.3   Class high_resolution_clock  Y   
Merylmes answered 22/2, 2013 at 20:34 Comment(3)
Ah, ok, I suspected something like that. At least it's still monotonic, if I'm reading that right. - Wrac
Those docs are out of date; steady_clock is supported for GCC 4.7, but only if you build GCC with --enable-libstdcxx-time. - Amphibolous
The comment doesn't say it's monotonic; it says the class has the old name monotonic_clock from earlier C++0x drafts. In fact that's not true for GCC 4.7 and later; the docs are months out of date. The class is called steady_clock, but is_steady is only true when --enable-libstdcxx-time is used. - Amphibolous
