The unit of time used by the clock
function is arbitrary. On most platforms, it is unrelated to the processor speed. It's more commonly related to the frequency of an external timer interrupt — which may be configured in software — or to a historical value that's been kept for compatibility through years of processor evolution. You need to use the macro CLOCKS_PER_SEC
to convert to real time.
printf("Total time taken by CPU: %fs\n", (double)total_t / CLOCKS_PER_SEC);
The C standard library was designed to be implementable on a wide range of hardware, including processors that don't have an internal timer and rely on an external peripheral to tell the time. Many platforms have more precise ways to measure wall clock time than time
and more precise ways to measure CPU consumption than clock
. For example, on POSIX systems (e.g. Linux and other Unix-like systems), you can use getrusage
, which has microsecond precision.
#include <stdio.h>
#include <sys/resource.h>

struct timeval start, end;
struct rusage usage;

getrusage(RUSAGE_SELF, &usage);
start = usage.ru_utime;   /* user CPU time consumed so far */
…
getrusage(RUSAGE_SELF, &usage);
end = usage.ru_utime;
printf("Total time taken by CPU: %fs\n",
       (double)(end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6);
Where available, clock_gettime with CLOCK_THREAD_CPUTIME_ID or CLOCK_PROCESS_CPUTIME_ID may give better precision: it reports values with nanosecond resolution.
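A minimal sketch of that approach (POSIX only; on some older systems you may need to link with -lrt):

#include <stdio.h>
#include <time.h>

struct timespec start, end;

clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
/* ... work to be measured ... */
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

printf("Total time taken by CPU: %fs\n",
       (double)(end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9);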
Note the difference between precision and accuracy: precision is the granularity with which values are reported, while accuracy is how close the reported values are to the true values. Unless you are working on a real-time system, there are no hard guarantees as to how long a piece of code takes, including the invocations of the measurement functions themselves.
Some processors have cycle clocks that count processor cycles rather than wall clock time, but this gets very system-specific.
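As one illustration, here is a sketch using the x86 time-stamp counter through the __rdtsc intrinsic (available in x86intrin.h with GCC and Clang; note that on modern CPUs the TSC typically ticks at a constant reference frequency rather than the actual core frequency, and the count can be distorted if the thread migrates between cores):

#include <stdio.h>
#include <x86intrin.h>

int main(void) {
    unsigned long long c0 = __rdtsc();   /* read the time-stamp counter */

    /* ... work to be measured ... */

    unsigned long long c1 = __rdtsc();
    printf("Elapsed TSC ticks: %llu\n", c1 - c0);
    return 0;
}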
Whenever making benchmarks, beware that what you are measuring is the execution of this particular executable on this particular CPU in these particular circumstances, and the results may or may not generalize to other situations. For example, the empty loop in your question will be optimized away by most compilers unless you turn optimizations off, and measuring the speed of unoptimized code is usually pointless. Even if you add real work in the loop, beware of toy benchmarks: they often don't have the same performance characteristics as real-world code. On modern high-end CPUs such as those found in PCs and smartphones, benchmarks of CPU-intensive code are often very sensitive to cache effects, and the results can depend on what else is running on the system, on the exact CPU model (due to different cache sizes and layouts), on the address at which the code happens to be loaded, and so on.
Divide the clock difference, total_t in your case, by CLOCKS_PER_SEC. Note that you need to cast total_t into a floating point value for it to work. – Indented

Names ending in _t are usually used for type aliases (as created with typedef), for example size_t or time_t and even clock_t. – Indented

The reason the values of CLOCKS_PER_SEC are different on different platforms is that the "ticks" are platform dependent. They depend not only on the hardware, but also on the operating system. The resolution and precision are something you have to find out from your operating system documentation. The only way to reliably and portably get the number of seconds is to divide the (floating point) difference by CLOCKS_PER_SEC. The numbers themselves are otherwise pretty meaningless. – Indented

There is no way to change the resolution of the clock function or its internal workings, but depending on the operating system there might be higher-resolution timers available. And if you're on an embedded system with only a minimal operating system, the hardware might have timers you could use instead. – Indented