It is certainly not allowed since, as you have noted, it changes the observable behavior (different output) of the program. (I won't go into the hypothetical case that `veryLongComputation()` might not consume any measurable time, which, given the function's name, is presumably not the case. Even if it were the case, it wouldn't really matter.) You wouldn't expect it to be allowable to reorder `fopen` and `fwrite`, would you?
Both `t0` and `t1` are used in outputting `t1-t0`. Therefore, the initializer expressions for both `t0` and `t1` must be executed, and doing so must follow all standard rules. The result of the function is used, so the function call cannot be optimized out. It doesn't directly depend on `t1`, or vice versa, so one might naively be inclined to think that it's legal to move it around. Why not move it after the initialization of `t1`, which doesn't depend on the calculation?
Indirectly, however, the value of `t1` does of course depend on side effects of `veryLongComputation()` (notably the computation taking time, if nothing else), which is exactly one of the reasons that there exists such a thing as a "sequence point".
There are three "end of expression" sequence points (plus three "end of function" and "end of initializer" SPs), and at every sequence point it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.
There is no way you can keep this promise if you move around the three statements, since the possible side effects of all functions called are not known. The compiler is only allowed to optimize if it can guarantee that it will keep the promise. It can't, since the library functions are opaque: their code isn't available (nor is the code within `veryLongComputation` necessarily known in that translation unit).
Compilers do, however, sometimes have "special knowledge" about library functions, such as that some functions will not return or may return twice (think `exit` or `setjmp`).
However, since every non-empty, non-trivial function (and `veryLongComputation` is quite non-trivial, judging from its name) will consume time, a compiler having "special knowledge" about the otherwise opaque `clock` library function would in fact have to be explicitly disallowed from reordering calls around it, knowing that doing so not only may, but will, affect the results.
Now the interesting question is: why does the compiler do this anyway? I can think of two possibilities. Maybe your code triggers a "looks like benchmark" heuristic and the compiler is trying to cheat, who knows. It wouldn't be the first time (think SPEC2000/179.art, or SunSpider, for two historic examples). The other possibility is that somewhere inside `veryLongComputation()` you inadvertently invoke undefined behavior. In that case, the compiler's behavior would even be legal.
[...] `g++ -O2 -S -fverbose-asm your-code.cc` with GCC...) – Burstone
The call to `veryLongComputation()` could very well be instantaneous. – Tael
"The call to veryLongComputation() could very well be instantaneous" - I disagree. There is a great number of algorithms which, given sufficiently large data, are guaranteed not to complete before the end of the solar system using any computing hardware known at the time the C++ standard was written. – Virtuosor
[...] `volatile`. – Ahmed
`volatile` was the solution suggested in the question he referenced. – Lemmuela
`veryLongComputation` returns a value, which is sent to `cout` - I thought it qualified as an I/O. – Virtuoso
[...] `t1-t0` depends on code order and leave it alone. That's what gcc and clang do. VC++ does it for some functions and reorders for some others without an obvious pattern. – Virtuoso
[...] `clock`, it isn't because they think it would be wrong, it is simply that their optimization heuristics didn't find any particular reason to do it in this particular testcase, but they would happily do it on small variations. Benchmarking is not the main goal of the language, so it isn't surprising that it requires some extra work. – Luker
"Benchmarking is not the main goal of the language" I think that misses the point. It may be benchmarking in this case, it might be auto-tuning in another. Assuming benchmarking and then dismissing it as unimportant violates the principle of least surprise. – Levi
[...] `clock()` call. This would require a minor constraint on the optimizer, easy to implement, as there are already constraints of this sort imposed for other reasons. This would remove the violation of the principle of least surprise. Better yet, make the C++ abstract machine aware of the non-zero time cost of non-elided computations. – Virtuoso
[...] `clock()` can only do so much -- there are probably a half dozen functions just in Win32 to get the time in various formats and precisions...and the built-in inability to use one to write one's own equivalent of `clock()` would be pretty freaking surprising too. – Brashear
`#pragma optimize("", off)` allegedly disables all optimizations in the function(s) following it. You can reenable them with `#pragma optimize("", on)`. – Brashear
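The `volatile` suggestion from the comments can be sketched as follows. This is a hedged illustration rather than a guarantee: writing the result to a `volatile` object makes the store an observable side effect, so the computation cannot simply be dropped, and in practice this tends to keep it between the two `clock()` calls. The body of `veryLongComputation()` is a hypothetical stand-in, and the names `sink` and `timedRunVolatile` are invented for this sketch.

```cpp
#include <ctime>

// Hypothetical stand-in for the real function; the loop bound is arbitrary.
long veryLongComputation() {
    long sum = 0;
    for (long i = 0; i < 20000000L; ++i) {
        sum += i % 7;
    }
    return sum;
}

// Name invented for this sketch. Accesses to a volatile object count as
// observable behavior, so the store below cannot be optimized away.
volatile long sink = 0;

std::clock_t timedRunVolatile() {
    std::clock_t t0 = std::clock();
    sink = veryLongComputation();   // volatile write between the two clock() calls
    std::clock_t t1 = std::clock();
    return t1 - t0;
}
```

Whether a given compiler still hoists part of the computation is best checked by inspecting the generated assembly.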