The fact that what you see is the result of some JIT optimization should be clear by now looking at all the comments you received. But what is really happening and why that code is optimized always nearly after the same amount of iterations of the outer for
?
I'll try to answer both questions but please remember that everything explained here is relative only to Oracle's Hotspot VM. There is no Java specification that defines how a JVM JIT should behave.
First of all, let's see what the JIT is doing running that Test program with some additional flag (the plain JVM is enough to run this, no need to load the debugging shared library, required for some of the UnlockDiagnosticVMOptions
options):
java -XX:+PrintCompilation Test
The execution completes with this output (removing a few lines at the beginning that show that other methods are being compiled):
[...]
195017
184573
184342
184262
183491
189494
131 51% 3 Test::getTimes @ 2 (22 bytes)
245167
132 52 3 Test::getTimes (22 bytes)
165144
65090
132 53 1 java.nio.Buffer::limit (5 bytes)
59427
132 54% 4 Test::getTimes @ 2 (22 bytes)
75137
48110
135 51% 3 Test::getTimes @ -2 (22 bytes) made not entrant
142 55 4 Test::getTimes (22 bytes)
150820
86951
90012
91421
The printlns
from your code are interleaved with diagnostic information related to the compilation the JIT is performing.
Looking at a single line:
131 51% 3 Test::getTimes @ 2 (22 bytes)
Each column has the following meaning:
- Timestamp
- Compilation Id (with additional attributes if needed)
- Tiered compilation level
- Method short name (with @
osr_bci
if available)
- Compiled method size
Keeping only the lines related to getTimes
:
131 51% 3 Test::getTimes @ 2 (22 bytes)
132 52 3 Test::getTimes (22 bytes)
132 54% 4 Test::getTimes @ 2 (22 bytes)
135 51% 3 Test::getTimes @ -2 (22 bytes) made not entrant
142 55 4 Test::getTimes (22 bytes)
It's clear that getTimes
is being compiled more than once, but every time it's compiled in a different way.
That %
symbol means that on-stack replacement(OSR) has been performed, meaning that the 10k loop contained in getTimes
have been compiled isolated from the rest of the method and that the JVM replaced that section of the method code with the compiled version. The osr_bci
is an index that points to this new compiled block of code.
The next compilation is a classic JIT compilation that compiles all the getTimes
method (the size is still the same because there is nothing else in that method other than the loop).
The third time another OSR is performed but at a different tiered level. Tiered compilation have been added in Java7 and basically allows the JVM to choose the client or server JIT mode at runtime, switching freely between the two when necessary. The Client mode performs a simpler set of optimization strategies while the server mode is able to apply more sophisticated optimizations that on the other hand have a bigger cost in term of time spent compiling.
I will not go into details about the different modes or about tiered compilation, if you need additional information i recommend Java Performance: The Definitive Guide by Scott Oaks and also check this question that explain what changes between the levels.
Back to the output of PrintCompilation, the gist here is that from a certain point in time, a sequence of compilations with increasing complexity is performed until the method becomes apparently stable (i.e. the JIT doesn't compile it again).
So, why all this start at that certain point in time, after 5-10 iteration of the main loop?
Because the inner getTimes
loop has become "hot".
The Hotspot VM, usually defines "hot" those methods that have been invoked at least 10k times (that's the historical default threshold, can be changed using -XX:CompileThreshold=<num>
, with tiered compilation there are now multiple thresholds) but in the case of OSR i'm guessing that it's performed when a block of code is deemed "hot" enough, in term of absolute or relative execution time, inside the method contains it.
Additional References
PrintCompilation Guide by Krystal Mok
Java Performance: The Definitive Guide
d
for anything, so the whole loop could be simply removed by the JIT compiler as dead code. See also avoiding benchmarking pitfalls. – ChrissychristgetTimes
since it's not having any lasting affet – Ainsleyfor (int i=0; i<5000; i++)
, and start it withjava -server YourProgram
, to see how your "code" seems to be executed in 977 nanoseconds (on windows, at least). (This is the time that light takes to travel 300 meters, by the way) – Backslide