What JVM optimization is causing these performance results? [closed]

1

8

In a Java REST service performance test, I got an unexpected pattern: a method that creates and returns the same value object on every invocation runs faster than another version that just returns a value object stored in a class or object field.

Code:

@POST @Path("inline") public Response inline(String s) { 
    return Response.status(Status.CREATED).build(); 
}    

private static final Response RESP = Response.status(Status.CREATED).build();
@POST @Path("staticfield") public Response staticField(String s) { 
    return RESP; 
}

private final Response resp = Response.status(Status.CREATED).build();
@POST @Path("field") public Response field(String s) { 
    return resp; 
}

Byte code:

  • Inline (faster): getstatic, invokestatic, invokevirtual, areturn
  • Static field (slower): getstatic, areturn
  • Object field (slower): aload, getfield, areturn

Performance (using Apache AB, single thread, several runs with consistent results):

  • Inline: 17078.29 [#/sec] (mean)
  • Static field: 5242.64 [#/sec] (mean)
  • Object field: 5417.40 [#/sec] (mean)

Environment: RHEL6 + Oracle JDK 1.7.0_60-b19, 64-bit

Is it possible that the JVM optimized the inline version into native code, but never considered optimizing the other two because they are already pretty small?

Abdullah answered 25/7, 2014 at 16:58 Comment(9)
I think it's most likely that something's not working the way you think it is, outside of the above code. - Defect
"Returns always the same value object"... Perhaps the REST layer knows that the result can be cached then? - Moan
Post a complete, compilable benchmark. Only then can we dig into what's going on. - Nib
@ThorbjørnRavnAndersen I've verified that the response is not cached. The REST framework (which I have excluded to simplify) should not interfere, because the class exposes exactly the same functional behavior in the three methods. The only difference is the implementation, which is visible only to the JVM. - Abdullah
@Nib thanks for jumping in. Can you elaborate on this? What would you like to have in addition? I'm trying to keep the question as simple as possible (because I presume this is due to a JVM optimization), but I'd be happy to add anything suggested. - Abdullah
@HotLicks I understand what you are saying. However, anything outside the above code should see the three methods in exactly the same way (they have the same signature and exactly the same functional behavior), which is why I conclude that whatever is impacting these numbers has access to the implementation. In fact, if I switch the implementation but keep everything else the same, the numbers change accordingly. - Abdullah
You've asked for an explanation. The most plausible one is that something outside of the above code accounts for the difference. E.g., if a new instance were created for every iteration, that would easily explain the Object case. And weird stuff with "unsafe" (which is apt to be present in many server environments) could easily account for both cases. - Defect
Could you try to get a listing of JIT-compiled asm for the above methods using one of the techniques mentioned here? #1503979 - Tourist
@Abdullah make a complete example others can download and run. Either you find out why you see what you do in the process of trimming, or others can have a closer look at your trimmed example. If you do not do that, your question will most likely be closed. - Moan
4

As pointed out in the comments, it is difficult to tell without actually looking at the assembly. Since you are using a REST framework, however, I assume it would be hard to tell even from the assembly, as there is quite a lot of code to read.

Instead, I want to give you an educated guess, because your code is an archetypical example of applying constant folding. When a value is created inline and not read from a field, the JVM can safely assume that this value is constant. When JIT-compiling the method, the constant expression can therefore be safely merged with your framework code, which probably leads to less JIT assembly and therefore improved performance. For a field value, even a final one, a constant value cannot be assumed, as the field value can change. (This holds as long as the field value is not a compile-time constant, that is, a primitive or a constant String, which javac inlines.) The JVM can therefore probably not constant-fold the value.
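To illustrate the point that a final field "can change", here is a minimal sketch. It assumes reflection as the mechanism, which the explanation above does not name explicitly. The class, field, and method names are hypothetical and unrelated to the question's REST code.

```java
import java.lang.reflect.Field;

public class FinalFieldDemo {
    static final class Holder {
        // Not a compile-time constant, so javac does not inline it at use sites.
        private final StringBuilder resp = new StringBuilder("created");

        StringBuilder field() {
            return resp;
        }
    }

    // Overwrites the final instance field reflectively, then reads it back
    // through the ordinary accessor.
    static String readAfterReflectiveWrite() {
        try {
            Holder h = new Holder();
            Field f = Holder.class.getDeclaredField("resp");
            f.setAccessible(true); // required before writing a final instance field
            f.set(h, new StringBuilder("changed"));
            return h.field().toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterReflectiveWrite());
    }
}
```

Because a write like this is permitted for instance fields, a JIT compiler that baked the field's first value into compiled code would risk returning a stale object. That is one plausible reason for it to be conservative here.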

You can read more on constant folding in the JMH tutorial, where it is noted:

If JVM realizes the result of the computation is the same no matter what, it can cleverly optimize it. In our case, that means we can move the computation outside of the internal JMH loop. This can be prevented by always reading the inputs from the state, computing the result based on that state, and the follow the rules to prevent DCE.

I hope you used such a framework. Otherwise, your performance metric is unlikely to be valid.
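For readers without JMH at hand, the idea in the quoted passage can be sketched in plain Java. This is only an illustration, not a substitute for JMH: the class and method names are hypothetical, the timing is crude, and the JIT may still simplify either loop more than expected. It contrasts a computation that depends on no state with one that reads its input from a non-final field, and it feeds every result into a sink so that the work cannot be discarded as dead code.

```java
public class FoldingSketch {
    // Non-final instance state, comparable to reading benchmark inputs from state.
    private int input = 21;

    // Depends on no state at all. javac already folds 21 * 2 at compile time,
    // and the JIT can go further and fold it into the callers.
    static int constantExpr() {
        return 21 * 2;
    }

    // Reads a non-final field, so the result cannot simply be baked in as a
    // literal unless the JIT proves that the field never changes.
    int fromState() {
        return input * 2;
    }

    // Both loops add every result to a returned sink so the work is not dead code.
    static long runConstant(int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += constantExpr();
        }
        return sink;
    }

    long runFromState(int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += fromState();
        }
        return sink;
    }

    public static void main(String[] args) {
        FoldingSketch state = new FoldingSketch();
        int iterations = 50_000_000;
        long sink = 0;
        // Warm-up rounds so that both paths are JIT compiled before timing.
        for (int round = 0; round < 5; round++) {
            sink += runConstant(iterations) + state.runFromState(iterations);
        }
        long t0 = System.nanoTime();
        sink += runConstant(iterations);
        long t1 = System.nanoTime();
        sink += state.runFromState(iterations);
        long t2 = System.nanoTime();
        System.out.println("constant:   " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("from state: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("sink: " + sink); // printing keeps the results live
    }
}
```

Any difference between the two timings depends on the JVM version and flags, so treat the numbers as a hint at best. A harness like JMH handles warm-up, forking, and dead-code elimination far more carefully.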

From reading the byte code, you can generally not learn much about runtime performance, as the JIT compiler can turn the byte code into anything during optimization. The byte code layout should only matter when code is interpreted, which is generally not the state in which one would measure performance, as performance-critical, hot code is always JIT compiled.

Burnett answered 26/7, 2014 at 10:18 Comment(1)
This is the best answer so far. I've repeated the performance tests with no frameworks at all, and the inline method is slower (the opposite result). So, whatever optimization is happening, it happens only when the framework code is active. As you say, it is difficult to be sure that this is what is really happening in this case, but your explanation fits the context nicely and I learned something new today thanks to your response. Thanks! - Abdullah
