newInstance vs new in jdk-9/jdk-8 and jmh

Asked 14/3, 2017 at 12:55 Answered 15/3, 2017 at 0:50

Solved java performance java-8 jmh java-9

I've seen a lot of threads here that compare and try to answer which is faster: newInstance or new operator.

Looking at the source code, newInstance looks like it should be much slower: it performs several security checks and goes through reflection. So I decided to measure, starting with jdk-8. Here is the code, using jmh.

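import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(value = { Mode.AverageTime, Mode.SingleShotTime })
@Warmup(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)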
public class TestNewObject {
    public static void main(String[] args) throws RunnerException {

        Options opt = new OptionsBuilder().include(TestNewObject.class.getSimpleName()).build();
        new Runner(opt).run();
    }

    @Fork(1)
    @Benchmark
    public Something newOperator() {
       return new Something();
    }

    @SuppressWarnings("deprecation")
    @Fork(1)
    @Benchmark
    public Something newInstance() throws InstantiationException, IllegalAccessException {
         return Something.class.newInstance();
    }

    static class Something {

    } 
}

I don't think there are big surprises here (the JIT does a lot of optimizations that keep the difference small):

Benchmark                  Mode  Cnt      Score      Error  Units
TestNewObject.newInstance  avgt    5      7.762 ±    0.745  ns/op
TestNewObject.newOperator  avgt    5      4.714 ±    1.480  ns/op
TestNewObject.newInstance    ss    5  10666.200 ± 4261.855  ns/op
TestNewObject.newOperator    ss    5   1522.800 ± 2558.524  ns/op

The difference for hot code is under 2x (about 1.6x here), and it is much worse for single-shot time (roughly 7x, with wide error bars).

Now I switch to jdk-9 (build 157 in case it matters) and run the same code. And the results:

 Benchmark                  Mode  Cnt      Score      Error  Units
 TestNewObject.newInstance  avgt    5    314.307 ±   55.054  ns/op
 TestNewObject.newOperator  avgt    5      4.602 ±    1.084  ns/op
 TestNewObject.newInstance    ss    5  10798.400 ± 5090.458  ns/op
 TestNewObject.newOperator    ss    5   3269.800 ± 4545.827  ns/op

That's a whopping difference in hot code: newInstance is now about 68x slower than new, and about 40x slower than it was on jdk-8. I'm using the latest jmh version (1.19.SNAPSHOT).

After adding one more method to the test:

@Fork(1)
@Benchmark
public Something newInstanceJDK9() throws Exception {
    return Something.class.getDeclaredConstructor().newInstance();
}

Here are the overall results on jdk-9:

TestNewObject.newInstance      avgt    5    308.342 ±   107.563  ns/op
TestNewObject.newInstanceJDK9  avgt    5     50.659 ±     7.964  ns/op
TestNewObject.newOperator      avgt    5      4.554 ±     0.616  ns/op

Can someone shed some light on why there is such a big difference?

Gigantism answered 14/3, 2017 at 12:55 Comment(11)

Are you using a JDK9 build with jigsaw? – Submaxillary 14/3, 2017 at 13:5

@JornVernee not sure it matters, but the answer is no - I am not. – Gigantism 14/3, 2017 at 13:6

It would matter, since there would be a number of additional access checks for the module system that the JIT might not know how to handle nicely yet. – Submaxillary 14/3, 2017 at 13:8

@JornVernee almost everything jigsaw-wise is in the non-jigsaw build anyway, as far as I know. But yes, that is the only thing that comes to my mind too for the slower part in jdk-9. – Gigantism 14/3, 2017 at 13:10

Class.newInstance() is deprecated in Java 9. The performance of the recommended alternative, clazz.getDeclaredConstructor().newInstance() would be interesting… – Holp 14/3, 2017 at 15:5

@Holp good point, added. The diff is still 10x, better, but far from 2x... – Gigantism 14/3, 2017 at 15:11

Can you make another test, now with a non-public (default access) constructor in Something? – Holp 14/3, 2017 at 15:13

@Holp just did - the results are almost the same as with the public constructor. – Gigantism 14/3, 2017 at 15:16

You could cache the Constructor instance in a static final field instead of using Something.class.getDeclaredConstructor().newInstance(). You could also try a MethodHandle for the constructor and cache that. Method handles move the access checks to creation time instead of call time, so they should be much faster. – Verbify 14/3, 2017 at 17:35

There is no point in speculating about the differences unless you have perfasm profiles for both benchmarks. – Jevons 14/3, 2017 at 20:28

Once you do that profiling exercise, try with -XX:-TieredCompilation ;) – Jevons 14/3, 2017 at 20:46
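A minimal sketch of the caching suggestion from the comments above, assuming a public nested class Something with a no-arg constructor. The class name CachedInstantiation and the two helper methods are hypothetical names chosen for illustration, and nothing here has been benchmarked. It only shows the shape of the two approaches: a Constructor looked up once, and a MethodHandle whose access check happens at lookup time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;

public class CachedInstantiation {

    // Stand-in for the class from the question (assumption: public no-arg constructor).
    public static class Something {
        public Something() { }
    }

    // Both lookups happen once, during class initialization.
    private static final Constructor<Something> CTOR;
    private static final MethodHandle CTOR_HANDLE;

    static {
        try {
            CTOR = Something.class.getDeclaredConstructor();
            // The access check for the handle is performed here, at lookup time.
            CTOR_HANDLE = MethodHandles.lookup()
                    .findConstructor(Something.class, MethodType.methodType(void.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Reuses the cached Constructor instead of looking it up on every call.
    public static Something viaCachedConstructor() throws ReflectiveOperationException {
        return CTOR.newInstance();
    }

    // invokeExact needs the call-site type to match () -> Something exactly, hence the cast.
    public static Something viaMethodHandle() throws Throwable {
        return (Something) CTOR_HANDLE.invokeExact();
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(viaCachedConstructor().getClass().getSimpleName());
        System.out.println(viaMethodHandle().getClass().getSimpleName());
    }
}
```

To compare these against newInstance and new, each helper would go into its own @Benchmark method in the test class from the question.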

First of all, the problem has nothing to do with the module system (directly).

I noticed that even with JDK 9 the first warmup iteration of newInstance was as fast as with JDK 8.

# Fork: 1 of 1
# Warmup Iteration   1: 10,578 ns/op    <-- Fast!
# Warmup Iteration   2: 246,426 ns/op
# Warmup Iteration   3: 242,347 ns/op

This means something has broken in JIT compilation.
-XX:+PrintCompilation confirmed that the benchmark was recompiled after the first iteration:

10,762 ns/op
# Warmup Iteration   2:    1541  689   !   3       java.lang.Class::newInstance (160 bytes)   made not entrant
   1548  692 %     4       bench.generated.NewInstance_newInstance_jmhTest::newInstance_avgt_jmhStub @ 13 (56 bytes)
   1552  693       4       bench.generated.NewInstance_newInstance_jmhTest::newInstance_avgt_jmhStub (56 bytes)
   1555  662       3       bench.generated.NewInstance_newInstance_jmhTest::newInstance_avgt_jmhStub (56 bytes)   made not entrant
248,023 ns/op

Then -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining pointed to the inlining problem:

1577  667 %     4       bench.generated.NewInstance_newInstance_jmhTest::newInstance_avgt_jmhStub @ 13 (56 bytes)
                           @ 17   bench.NewInstance::newInstance (6 bytes)   inline (hot)
            !                @ 2   java.lang.Class::newInstance (160 bytes)   already compiled into a big method

The "already compiled into a big method" message means that the compiler failed to inline the Class.newInstance call because the compiled size of the callee is larger than the InlineSmallCode value (2000 by default).

When I reran the benchmark with -XX:InlineSmallCode=2500, it became fast again.

Benchmark                Mode  Cnt  Score   Error  Units
NewInstance.newInstance  avgt    5  8,847 ± 0,080  ns/op
NewInstance.operatorNew  avgt    5  5,042 ± 0,177  ns/op

Note that JDK 9 now has G1 as the default GC. If I fall back to Parallel GC, the benchmark is also fast, even with the default InlineSmallCode.

Rerunning the JDK 9 benchmark with -XX:+UseParallelGC:

Benchmark                Mode  Cnt  Score   Error  Units
NewInstance.newInstance  avgt    5  8,728 ± 0,143  ns/op
NewInstance.operatorNew  avgt    5  4,822 ± 0,096  ns/op

G1 has to emit barriers on every object store, which makes the compiled code a bit larger, so that Class.newInstance exceeds the default InlineSmallCode limit. Another reason the compiled Class.newInstance has grown is that the reflection code was slightly rewritten in JDK 9.
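Since the result depends on which collector is active and on the InlineSmallCode value, it can help to confirm both from inside the JVM that runs the benchmark. The sketch below is one way to do that, not part of the original measurements. JvmConfigProbe, collectorNames and flagValue are hypothetical names, and the flag lookup assumes a HotSpot JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; on other VMs it falls back to "n/a".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmConfigProbe {

    // Names of the collectors registered by the running JVM.
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Current value of a VM flag as a string, or "n/a" if it cannot be read
    // (unknown flag, or a JVM without the HotSpot diagnostic bean).
    public static String flagValue(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? "n/a" : bean.getVMOption(flag).getValue();
        } catch (RuntimeException e) {
            return "n/a";
        }
    }

    public static void main(String[] args) {
        System.out.println("Collectors:      " + collectorNames());
        System.out.println("UseG1GC:         " + flagValue("UseG1GC"));
        System.out.println("InlineSmallCode: " + flagValue("InlineSmallCode"));
    }
}
```

Running it once with the default options and once with -XX:+UseParallelGC or -XX:InlineSmallCode=2500 shows whether the flags actually took effect in that JVM.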

TL;DR: The JIT failed to inline Class.newInstance because the InlineSmallCode limit was exceeded. The compiled version of Class.newInstance has become larger due to changes in the reflection code in JDK 9 and because the default GC has been changed to G1.

Pooi answered 15/3, 2017 at 0:50 Comment(3)

Isn't this a big problem for reflection-heavy frameworks like Spring? Maybe this is worth a report so that the method can be made smaller (e.g. by extracting some code into separate methods). – Minster 20/3, 2017 at 13:11

@KirillRakhman This should not be an issue, since in real-life scenarios newInstance is unlikely to be inlined anyway. I can't imagine a reasonable case where the same constructor is called many times at the same place via reflection. In the original question the performance gain is seen only because the JIT adapts to calling the specific method. – Pooi 20/3, 2017 at 23:31

great explanation, even has a tldr! – Stereoisomer 19/9, 2018 at 13:59

The implementation of Class.newInstance() is mostly identical, except for the following part:

Java 8:

Constructor<T> tmpConstructor = cachedConstructor;
// Security check (same as in java.lang.reflect.Constructor)
int modifiers = tmpConstructor.getModifiers();
if (!Reflection.quickCheckMemberAccess(this, modifiers)) {
    Class<?> caller = Reflection.getCallerClass();
    if (newInstanceCallerCache != caller) {
        Reflection.ensureMemberAccess(caller, this, null, modifiers);
        newInstanceCallerCache = caller;
    }
}

Java 9:

Constructor<T> tmpConstructor = cachedConstructor;
// Security check (same as in java.lang.reflect.Constructor)
Class<?> caller = Reflection.getCallerClass();
if (newInstanceCallerCache != caller) {
    int modifiers = tmpConstructor.getModifiers();
    Reflection.ensureMemberAccess(caller, this, null, modifiers);
    newInstanceCallerCache = caller;
}

As you can see, Java 8 had a quickCheckMemberAccess call which allowed it to bypass expensive operations like Reflection.getCallerClass(). This quick check has been removed, I’d guess, because it wasn’t compatible with the new module access rules.

But there’s more to it. The JVM might optimize reflective instantiations with a predictable type and Something.class.newInstance() refers to a perfectly predictable type. This optimization might have become less effective. There are several possible reasons:

the new module access rules complicate the process
since Class.newInstance() has been deprecated, some support has been deliberately removed (seems unlikely to me)
due to the changed implementation code shown above, HotSpot fails to recognize certain code patterns that trigger the optimizations

Holp answered 14/3, 2017 at 16:42 Comment(0)
