JIT not optimizing loop that involves Integer.MAX_VALUE
Asked Answered
G

1

49

While writing an answer to another question, I noticed a strange border case for JIT optimization.

The following program is not a "Microbenchmark" and not intended to reliably measure an execution time (as pointed out in the answers to the other question). It is solely intended as an MCVE to reproduce the issue:

class MissedLoopOptimization
{
    public static void main(String args[])
    {
        for (int j=0; j<3; j++)
        {
            for (int i=0; i<5; i++)
            {
                long before = System.nanoTime();
                runWithMaxValue();
                long after = System.nanoTime();
                System.out.println("With MAX_VALUE   : "+(after-before)/1e6);
            }
            for (int i=0; i<5; i++)
            {
                long before = System.nanoTime();
                runWithMaxValueMinusOne();
                long after = System.nanoTime();
                System.out.println("With MAX_VALUE-1 : "+(after-before)/1e6);
            }
        }
    }

    private static void runWithMaxValue()
    {
        final int n = Integer.MAX_VALUE;
        int i = 0;
        while (i++ < n) {}
    }

    private static void runWithMaxValueMinusOne()
    {
        final int n = Integer.MAX_VALUE-1;
        int i = 0;
        while (i++ < n) {}
    }
}

It basically runs the same loop, while (i++ < n){}, where the limit n is once set to Integer.MAX_VALUE, and once to Integer.MAX_VALUE-1.

When executing this on Win7/64 with JDK 1.7.0_21 and

java -server MissedLoopOptimization

the timing results are as follows:

...
With MAX_VALUE   : 1285.227081
With MAX_VALUE   : 1274.36311
With MAX_VALUE   : 1282.992203
With MAX_VALUE   : 1292.88246
With MAX_VALUE   : 1280.788994
With MAX_VALUE-1 : 6.96E-4
With MAX_VALUE-1 : 3.48E-4
With MAX_VALUE-1 : 0.0
With MAX_VALUE-1 : 0.0
With MAX_VALUE-1 : 3.48E-4

Obviously, for the case of MAX_VALUE-1, the JIT does what one could expect: It detects that the loop is useless, and completely eliminates it. However, it does not remove the loop when it is running up to MAX_VALUE.

This observation is confirmed by a look at the JIT assembly output when starting with

java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintAssembly MissedLoopOptimization

The log contains the following assembly for the method that runs up to MAX_VALUE:

Decoding compiled method 0x000000000254fa10:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} &apos;runWithMaxValue&apos; &apos;()V&apos; in &apos;MissedLoopOptimization&apos;
  #           [sp+0x20]  (sp of caller)
  0x000000000254fb40: sub    $0x18,%rsp
  0x000000000254fb47: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                ; - MissedLoopOptimization::runWithMaxValue@-1 (line 29)
  0x000000000254fb4c: mov    $0x1,%r11d
  0x000000000254fb52: jmp    0x000000000254fb63
  0x000000000254fb54: nopl   0x0(%rax,%rax,1)
  0x000000000254fb5c: data32 data32 xchg %ax,%ax
  0x000000000254fb60: inc    %r11d              ; OopMap{off=35}
                                                ;*goto
                                                ; - MissedLoopOptimization::runWithMaxValue@11 (line 30)
  0x000000000254fb63: test   %eax,-0x241fb69(%rip)        # 0x0000000000130000
                                                ;*goto
                                                ; - MissedLoopOptimization::runWithMaxValue@11 (line 30)
                                                ;   {poll}
  0x000000000254fb69: cmp    $0x7fffffff,%r11d
  0x000000000254fb70: jl     0x000000000254fb60  ;*if_icmpge
                                                ; - MissedLoopOptimization::runWithMaxValue@8 (line 30)
  0x000000000254fb72: add    $0x10,%rsp
  0x000000000254fb76: pop    %rbp
  0x000000000254fb77: test   %eax,-0x241fb7d(%rip)        # 0x0000000000130000
                                                ;   {poll_return}
  0x000000000254fb7d: retq   
  0x000000000254fb7e: hlt    
  0x000000000254fb7f: hlt    
[Exception Handler]
[Stub Code]
  0x000000000254fb80: jmpq   0x000000000254e820  ;   {no_reloc}
[Deopt Handler Code]
  0x000000000254fb85: callq  0x000000000254fb8a
  0x000000000254fb8a: subq   $0x5,(%rsp)
  0x000000000254fb8f: jmpq   0x0000000002528d00  ;   {runtime_call}
  0x000000000254fb94: hlt    
  0x000000000254fb95: hlt    
  0x000000000254fb96: hlt    
  0x000000000254fb97: hlt    

One can clearly see the loop, with the comparison to 0x7fffffff and the jump back to inc. In contrast to that, the assembly for the case where it is running up to MAX_VALUE-1:

Decoding compiled method 0x000000000254f650:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} &apos;runWithMaxValueMinusOne&apos; &apos;()V&apos; in &apos;MissedLoopOptimization&apos;
  #           [sp+0x20]  (sp of caller)
  0x000000000254f780: sub    $0x18,%rsp
  0x000000000254f787: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                ; - MissedLoopOptimization::runWithMaxValueMinusOne@-1 (line 36)
  0x000000000254f78c: add    $0x10,%rsp
  0x000000000254f790: pop    %rbp
  0x000000000254f791: test   %eax,-0x241f797(%rip)        # 0x0000000000130000
                                                ;   {poll_return}
  0x000000000254f797: retq   
  0x000000000254f798: hlt    
  0x000000000254f799: hlt    
  0x000000000254f79a: hlt    
  0x000000000254f79b: hlt    
  0x000000000254f79c: hlt    
  0x000000000254f79d: hlt    
  0x000000000254f79e: hlt    
  0x000000000254f79f: hlt    
[Exception Handler]
[Stub Code]
  0x000000000254f7a0: jmpq   0x000000000254e820  ;   {no_reloc}
[Deopt Handler Code]
  0x000000000254f7a5: callq  0x000000000254f7aa
  0x000000000254f7aa: subq   $0x5,(%rsp)
  0x000000000254f7af: jmpq   0x0000000002528d00  ;   {runtime_call}
  0x000000000254f7b4: hlt    
  0x000000000254f7b5: hlt    
  0x000000000254f7b6: hlt    
  0x000000000254f7b7: hlt    

So my question is: What is so special about the Integer.MAX_VALUE that prevents the JIT from optimizing it in the same way as it does for Integer.MAX_VALUE-1? My guess would be that has to do with the cmp instruction, which is intended for signed arithmetic, but that alone is not really a convincing reason. Can anybody explain this, and maybe even give a pointer to the OpenJDK HotSpot code where this case is treated?

(An aside: I hope that the answer will also explain the different behavior between i++ and ++i that was asked for in the other question, assuming that the reason for the missing optimization is (obviously) actually caused by the Integer.MAX_VALUE loop limit)

Giesser answered 15/8, 2014 at 12:26 Comment(2)
I wonder what would be the output if you do while (++i <= n) {}Manifold
@Manifold Acutally, I tried different configurations, like i++<n, ++i<n, i++<=n, and n<=++i etc, but did not systematically compare and document all possible configurations in order to keep the actual question clear.Giesser
E
37

I have not dug up the Java Language Specification, but I'd guess that it has to do with this difference:

  • i++ < (Integer.MAX_VALUE - 1) never overflows. Once i reaches Integer.MAX_VALUE - 1 it is incremented to Integer.MAX_VALUE and then the loop terminates.

  • i++ < Integer.MAX_VALUE contains an integer overflow. Once i reaches Integer.MAX_VALUE, it is incremented by one causing an overflow and then the loop terminates.

I assume that the JIT compiler is "reluctant" to optimize-out loops with such corner conditions - there was a whole bunch of bugs w.r.t. loop optimization in integer overflow conditions, so that reluctance is probably quite warranted.

There may also be some hard requirement that does not allow integer overflows to be optimized-out, although I somehow doubt that since integer overflows are not directly detectable or otherwise handled in Java.

Enrichment answered 15/8, 2014 at 12:32 Comment(7)
I can't find the reference right now, but the JLS does have something akin to C++'s "as-if" rule so the JIT would be in its right to do the optimization (there's no way in Java to notice whether an operation overflowed such as checking flags). But obviously allowing optimizations when overflow is possible opens a big can of worms so yes that's my guess too.Martie
The JLS will probably not contain such an information, because it is mainly an issue of the JIT. The idea of preventing an overflow is in line with my assumption that it might be related to the signed cmp instruction, but still wonder why this seems to cause such a completely different behavior.Giesser
@Marco My guess is that the overflow causes some warning flags during constant propagation. HotSpot uses the following algorithm but I've never looked into the details.Martie
@Voo: there was a whole bunch of bugs w.r.t. integer overflows some time ago...Enrichment
Even if a loop does nothing, the optimizer has to be careful about removing as it must prove that the loop is not infinite before eliminating it. That’s obviously much harder when the variable involved in the loop condition can overflow.Brewington
Although it is not "THE" reason for the behavior (in terms of a line in the OpenJDK code with a comment like // Do not optimize to avoid overflow) I think that this is the answer. The link to the "bunch of bugs" convinced me that the compiler has to be tremendously careful in such border cases. It might be possible to back-track "THE" reason by going through all bug reports and search for the corresponding fixes in the HotSpot code, but considering that this is indeed a border case, one can simply assume that it is treated as such, because otherwise one of the errors could show up again.Giesser
Based on your answer, I have skimmed the OpenJDK change logs for the bug report, and I think I found the commit where this border case treatment (and the "bunch of bugs") was tackled: hg.openjdk.java.net/jdk7/jdk7/hotspot/rev/bad7ecd0b6ed (I have not read this completely .... it's quite a lot of code and changes... but it seems to be the right one). If you want to add this to your answer, I'll delete this comment. Otherwise, it may just remain here as a pointer for those who are interested.Giesser

© 2022 - 2024 — McMap. All rights reserved.