Why does getSum not get inlined by the HotSpot JVM?

Here's the example I tried to reproduce from Java Performance: The Definitive Guide, Page 97 on the topic of Escape Analysis. This is probably what should happen:

  1. getSum() must get hot enough, and with appropriate JVM parameters it must be inlined into its caller, main().
  2. Since neither the list nor the sum variable escapes from main(), both could be marked NoEscape, so the JVM could use stack allocation for them instead of heap allocation.
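
The difference between an object that stays inside a method and one that escapes can be sketched as follows. This is a minimal illustration, not the book's example: the class EscapeSketch, the Point type and the leaked field are hypothetical names made up for this sketch. Whether the JIT actually removes the allocation depends on inlining and on the JVM version.

```java
final class EscapeSketch {

    // Hypothetical value holder, used only for this illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Storing a reference here makes the object reachable from anywhere.
    static Point leaked;

    // Each Point is created, read and dropped inside the loop body.
    // Once the constructor is inlined, the JIT may treat it as NoEscape.
    static long sumNoEscape(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.x + p.y;
        }
        return total;
    }

    // Same arithmetic, but every Point is published through a static field,
    // so it escapes and has to live on the heap.
    static long sumEscaping(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            leaked = p;
            total += p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumNoEscape(1_000_000));
        System.out.println(sumEscaping(1_000_000));
    }
}
```

One way to observe the effect, under the assumption that the loops get hot enough, is to compare the -verbose:gc output of a run with default settings against a run with -XX:-DoEscapeAnalysis.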

But when I ran it through JITWatch, the result showed that getSum() is compiled to native code but is not inlined into main(). Consequently, stack allocation didn't happen either.

What am I doing wrong here? (I have put the whole code and HotSpot log here.)

Here's the code:

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.stream.IntStream;

public class EscapeAnalysisTest {

    private static class Sum {

        private BigInteger sum;
        private int n;

        Sum(int n) {
            this.n = n;
        }

        synchronized final BigInteger getSum() {
            if (sum == null) {
                sum = BigInteger.ZERO;
                for (int i = 0; i < n; i++) {
                    sum = sum.add(BigInteger.valueOf(i));
                }
            }
            return sum;
        }

    }

    public static void main(String[] args) {
        ArrayList<BigInteger> list = new ArrayList<>();
        for (int i = 1; i < 1000; i++) {
            Sum sum = new Sum(i);
            list.add(sum.getSum());
        }
        System.out.println(list.get(list.size() - 1));
    }

}

JVM parameters I used:

-server
-verbose:gc
-XX:+UnlockDiagnosticVMOptions
-XX:+TraceClassLoading
-XX:MaxInlineSize=60
-XX:+PrintAssembly
-XX:+LogCompilation
Instability answered 11/3, 2018 at 7:32 Comment(6)
The synchronized may be a reason. - Momentous
@Momentous I tried to copy from the book; nevertheless, I also tried without synchronized and still couldn't get it inlined. - Instability
Interesting, but I see an entry like this in your logs: task_queued compile_id='37' compile_kind='osr'. It seems OSR has kicked in. - Stork
Yes @Eugene, the getSum method gets compiled to native code, but I was expecting something else (as I described in the question). - Instability
Your loop is running in the main method, so it can only switch to an optimized version via OSR. This is known to have limitations. - Artillery
@Artillery I was leaving this one to answer "later today"... :) Too late, I guess. - Stork

In order to know why something is inlined or not, you can look in the compilation log for the inline_success and inline_fail tags.
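
For a large log it can help to summarise those tags instead of reading them one by one. Here is a minimal sketch that counts the reasons; it assumes the tags appear as XML elements carrying a single-quoted reason attribute (for example <inline_fail reason='too big'/>), and the class name InlineTagCounter is made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InlineTagCounter {

    // Assumed tag shape: <inline_success reason='...'/> or <inline_fail reason='...'/>
    private static final Pattern TAG =
            Pattern.compile("<(inline_success|inline_fail)\\s+reason='([^']*)'");

    // Returns "tag: reason" -> number of occurrences, sorted by key.
    static Map<String, Integer> countReasons(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = TAG.matcher(line);
            while (m.find()) {
                counts.merge(m.group(1) + ": " + m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java InlineTagCounter <compilation log file>");
            return;
        }
        // Reads the whole log into memory; fine for small experiments.
        countReasons(Files.readAllLines(Paths.get(args[0])))
                .forEach((reason, count) -> System.out.println(count + "\t" + reason));
    }
}
```

The path of the file written by -XX:+LogCompilation is passed as the only argument.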

However, for anything to be inlined at all, the caller has to be compiled. In your case you want getSum inlined into the main method, which runs only once, so the only way this can happen is through on-stack replacement (OSR). Looking at your log, you can see a few OSR compilations but none of the main method: there is simply not enough work in your main method.

You can fix that by increasing the number of iterations of your for loop. By increasing it to 100_000, I got a first OSR compilation.

For such a small example, I looked at the output of -XX:+PrintCompilation -XX:+PrintInlining rather than the whole LogCompilation output, and I saw:

@ 27   EscapeAnalysisTest$Sum::getSum (51 bytes)   inlining prohibited by policy

That's not very helpful, but a look at the HotSpot source code reveals that it's probably because of a policy that prevents C1 compilations from inlining methods that have already been OSR-compiled by C2. In any case, the inlining done by the C1 compilation is not that interesting.
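
If you want to filter this kind of output for one method, a line such as the one above can be split into its parts with a regular expression. This is only a sketch based on the lines shown in this answer: the class InliningLine is hypothetical, and the number after @ is taken here to be the bytecode index of the call site in the caller.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InliningLine {

    // Assumed shape: "@ <bci>   <class>::<method> (<n> bytes)   <reason>"
    private static final Pattern LINE =
            Pattern.compile("@\\s*(\\d+)\\s+(\\S+)\\s+\\((\\d+) bytes\\)\\s+(.*\\S)");

    final int bci;          // number after '@'
    final String method;    // e.g. EscapeAnalysisTest$Sum::getSum
    final int bytecodeSize; // size of the callee in bytes of bytecode
    final String reason;    // e.g. "inline (hot)" or "too big"

    private InliningLine(int bci, String method, int bytecodeSize, String reason) {
        this.bci = bci;
        this.method = method;
        this.bytecodeSize = bytecodeSize;
        this.reason = reason;
    }

    // Returns an empty Optional for lines that are not inlining decisions.
    static Optional<InliningLine> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new InliningLine(
                Integer.parseInt(m.group(1)),
                m.group(2),
                Integer.parseInt(m.group(3)),
                m.group(4)));
    }

    public static void main(String[] args) {
        String sample =
                "@ 27   EscapeAnalysisTest$Sum::getSum (51 bytes)   inlining prohibited by policy";
        parse(sample).ifPresent(l ->
                System.out.println(l.method + " at " + l.bci + ": " + l.reason));
    }
}
```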

Adding more loop iterations (1_000_000 with a modulo on the argument to Sum to reduce run time) gets us a C2 OSR of main with:

@31   EscapeAnalysisTest$Sum::getSum (51 bytes)   already compiled into a big method  

That part of C2's policy is rather self-descriptive and is controlled by the InlineSmallCode flag: -XX:InlineSmallCode=4k tells HotSpot that the threshold for "big method" is at 4kB of native code. On my machine that was enough to get getSum inlined:

  14206   45 %     4       EscapeAnalysisTest::main @ 10 (61 bytes)
                              @ 25   EscapeAnalysisTest$Sum::<init> (10 bytes)   inline (hot)
                                @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
              s               @ 31   EscapeAnalysisTest$Sum::getSum (51 bytes)   inline (hot)
                                @ 31   java.math.BigInteger::valueOf (62 bytes)   inline (hot)
                                  @ 58   java.math.BigInteger::<init> (77 bytes)   inline (hot)
                                    @ 1   java.lang.Number::<init> (5 bytes)   inline (hot)
                                      @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                @ 34   java.math.BigInteger::add (123 bytes)   inline (hot)
                                  @ 41   java.math.BigInteger::add (215 bytes)   inline (hot)
                                  @ 48   java.math.BigInteger::<init> (38 bytes)   inline (hot)
                                    @ 1   java.lang.Number::<init> (5 bytes)   inline (hot)
                                      @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                              @ 34   java.util.ArrayList::add (29 bytes)   inline (hot)
                                @ 7   java.util.ArrayList::ensureCapacityInternal (13 bytes)   inline (hot)
                                  @ 6   java.util.ArrayList::calculateCapacity (16 bytes)   inline (hot)
                                  @ 9   java.util.ArrayList::ensureExplicitCapacity (26 bytes)   inline (hot)
                                    @ 22   java.util.ArrayList::grow (45 bytes)   too big

(Note that I never had to use MaxInlineSize.)
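
To see which inlining thresholds a given JVM is actually running with, the flag values can be read at run time. The sketch below assumes a HotSpot-based JVM where the com.sun.management API is available; the class name ShowInlineFlags is made up, and FreqInlineSize is listed only as another related flag, it was not needed above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class ShowInlineFlags {

    // Returns the current value of a VM flag as a string.
    // Throws IllegalArgumentException if this JVM does not know the flag.
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = {"InlineSmallCode", "MaxInlineSize", "FreqInlineSize"};
        for (String name : names) {
            try {
                System.out.println(name + " = " + flagValue(name));
            } catch (IllegalArgumentException e) {
                System.out.println(name + " is not available on this JVM");
            }
        }
    }
}
```

Running it once with default settings and once with -XX:InlineSmallCode=4k shows whether the option was picked up.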

For reference here's the modified loop:

for (int i = 1; i < 1_000_000; i++) {
  Sum sum = new Sum(i % 10_000);
  list.add(sum.getSum());
}
Genro answered 12/3, 2018 at 11:18 Comment(1)
This one is interesting: inlining prohibited by policy. I've seen the sources, so if the JIT knows that C2 will use OSR it will not even inline it in C1, since there is probably no point. I see this as yet another form of "made not entrant" in the 3rd tier, when the 4th has already compiled this method into something "better". - Stork