Java 8: slow compilation of interfaces with thousands of default methods with the same name

Given the following interfaces (which are very large and generated from language definitions):

interface VisitorA {
   default void visit(ASTA1 node) {...}
   ...
   default void visit(ASTA2000 node) {...}
}

interface VisitorB extends VisitorA {
   default void visit(ASTB1 node) {...}
   ...
   default void visit(ASTB1000 node) {...}

   // due to language embedding, all visit methods of VisitorA
   // must be overridden
   @Override
   default void visit(ASTA1 node) {...}
   ...
   @Override
   default void visit(ASTA2000 node) {...}
}

interface VisitorC extends VisitorA {
   default void visit(ASTC1 node) {...}
   ...
   default void visit(ASTC1000 node) {...}

   // due to language embedding, all visit methods of VisitorA
   // must be overridden
   @Override
   default void visit(ASTA1 node) {...}
   ...
   @Override
   default void visit(ASTA2000 node) {...}
}

interface VisitorD extends VisitorB, VisitorC {
   default void visit(ASTD1 node) {...}
   ...
   default void visit(ASTD1000 node) {...}

   // due to language embedding, all visit methods of VisitorA,
   // VisitorB, and VisitorC must be overridden
   @Override
   default void visit(ASTA1 node) {...}
   ...
   @Override
   default void visit(ASTA2000 node) {...}

   @Override
   default void visit(ASTB1 node) {...}
   ...
   @Override
   default void visit(ASTB1000 node) {...}

   @Override
   default void visit(ASTC1 node) {...}
   ...
   @Override
   default void visit(ASTC1000 node) {...}
}

Compiling the interface VisitorA (which contains about 2,000 overloaded methods) takes about 10 s. Compiling VisitorB and VisitorC takes about 1.5 min each. But compiling the interface VisitorD takes the Java 8 compiler about 7 minutes!

  • Does anybody have an idea why compiling VisitorD takes so long?
  • Is it because of the inheritance of the default methods?
  • Or is it because of the diamond constellation, in which VisitorB and VisitorC both extend VisitorA, and VisitorD in turn extends both VisitorB and VisitorC?
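
For readers less familiar with the diamond shape asked about above, here is a minimal sketch with one method per interface instead of thousands. The type names (Node, Top, Left, Right, Bottom) are hypothetical and only mirror the VisitorA to VisitorD layout, and the methods return a String instead of void so that the result can be printed. It only illustrates why the bottom interface has to redeclare an inherited default; it says nothing about why javac is slow.

```java
// Hypothetical miniature of the VisitorA..VisitorD diamond (all names are illustrative).
class Node {}

interface Top {
    default String visit(Node node) { return "Top"; }
}

interface Left extends Top {
    @Override
    default String visit(Node node) { return "Left"; }
}

interface Right extends Top {
    @Override
    default String visit(Node node) { return "Right"; }
}

// Left and Right both override Top.visit(Node), so Bottom inherits two
// competing defaults and must override the method itself to compile.
interface Bottom extends Left, Right {
    @Override
    default String visit(Node node) {
        // An explicit super call selects one of the inherited defaults.
        return "Bottom->" + Left.super.visit(node);
    }
}

public class DiamondDemo {
    public static void main(String[] args) {
        Bottom bottom = new Bottom() {};
        System.out.println(bottom.visit(new Node())); // prints "Bottom->Left"
    }
}
```

Removing the override in Bottom makes the compiler reject the interface, which is the same rule that forces the generated VisitorD to redeclare the methods of its parents.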

We already experimented, and the following variant helped a little:

interface VisitorAPlain {
   void visit(ASTA1 node);
   ...
   void visit(ASTA2000 node);
}

interface VisitorA extends VisitorAPlain {
   ... // has same default methods as VisitorA above
}

interface VisitorBPlain extends VisitorAPlain {
   void visit(ASTB1 node);
   ...
   void visit(ASTB1000 node);
}

interface VisitorB extends VisitorBPlain {
   ... // has same default methods as VisitorB above
}

interface VisitorCPlain extends VisitorAPlain {
   void visit(ASTC1 node);
   ...
   void visit(ASTC1000 node);
}

interface VisitorC extends VisitorCPlain {
   ... // has same default methods as VisitorC above
}

interface VisitorD extends VisitorBPlain, VisitorCPlain {
   default void visit(ASTD1 node) {...}
   ...
   default void visit(ASTD1000 node) {...}

   // due to language embedding, all visit methods of VisitorAPlain,
   // VisitorBPlain, and VisitorCPlain must be overridden
   @Override
   default void visit(ASTA1 node) {...}
   ...
   default void visit(ASTA2000 node) {...}

   @Override
   default void visit(ASTB1 node) {...}
   ...
   default void visit(ASTB1000 node) {...}

   @Override
   default void visit(ASTC1 node) {...}
   ...
   default void visit(ASTC1000 node) {...}
}

Now compiling VisitorD takes only about 2 minutes, but that is still a lot.

  • Does anybody have an idea how to reduce the compilation time of VisitorD to a few seconds?
  • If we remove the two extends relations of VisitorD (extends VisitorBPlain, VisitorCPlain), compiling this interface takes about 15 s, even though it has about 5,000 default methods. But we need VisitorD to be compatible with VisitorB and VisitorC (either by direct extension or indirectly via the intermediate Plain interfaces) for casting reasons.

I also read the answers to the similar question "slow JDK8 compilation", but there the problem seemed to be generic type inference: "There's a severe performance regression in Java 8 when it comes to overload resolution based on generic target typing."

So this case is somewhat different. If anybody has a tip or a good explanation of why this happens, I would be very thankful.

Thank you, Michael

Asben answered 23/8, 2016 at 22:8 Comment(8)
Sorry, I don't have an answer, but I'm curious: how big are these files? I've got a project with about 1000 files totaling about 150,000 lines and, with Maven, it takes a bit over 15 seconds to compile. You must have some major files. - Proximity
This file is very large, about 270 kB. I uploaded it so you can see for yourself: drive.google.com/open?id=0B6L6K365bELNbXFhZVp6MG55RU0 - Asben
In our constellation, the generated visitor files have about 500 to 2,000 methods in one file. And as in the example linked above, one delegation visitor usually extends one or two other delegation visitors, which also have about 500 to 2,000 methods in one file. On top of that there are several extension steps. In general, the language extension (and so also the visitor extension) is: Java extends Common, MontiArc extends Java, MontiArcBehavior extends MontiArc, Automaton extends Common, AutomatonJava extends Automaton and Java, MontiArcAutomaton (the uploaded file) extends MontiArcBehavior and AutomatonJava. - Asben
This is caused by all the methods having the same name. So to check the override is correct (not accidentally overriding a bridge), each method has to be checked against each other, which gets quadratic (or worse). Having thousands of methods with the same name in a class puts a lot of stress on overload selection / checking, which is why you're seeing this. (Note that this doesn't come up too often in hand-written code, just generated code.) - Eanore
@BrianGoetz What would you suggest? Not using inheritance in this case, and letting the generator copy all the methods from the base class into the inheriting class? Then we would no longer have overridden methods, and it should become faster, right? What also surprises me: I thought overriding a bridge only has to be checked if one of the base classes uses generics, but all the visitor classes are generated and none of them contains any generics. Or is the check done anyway? - Asben
Name the methods visitAst1(AST1). (This is what most visitors do anyway.) - Eanore
But then double dispatch no longer works. Or how did you solve that? - Asben
The accept(Visitor v) method in AST1 delegates to v.visitAst1(this). - Eanore
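
The suggestion in the last comments can be sketched as follows. This is a minimal illustration, not the generated MontiCore code: AstNode, Ast1, Ast2, Visitor and NameCollector are hypothetical names. Each node's accept method calls a uniquely named visitor method, so dispatch still depends on the runtime type of the node, but no two visitor methods share a name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical node and visitor types illustrating uniquely named visit methods.
interface AstNode {
    void accept(Visitor v);
}

class Ast1 implements AstNode {
    @Override
    public void accept(Visitor v) { v.visitAst1(this); } // the node selects the method by name
}

class Ast2 implements AstNode {
    @Override
    public void accept(Visitor v) { v.visitAst2(this); }
}

interface Visitor {
    // No two methods share a name, so there are no overloads to resolve.
    default void visitAst1(Ast1 node) {}
    default void visitAst2(Ast2 node) {}
}

// Records which visit method was called for each node.
class NameCollector implements Visitor {
    final List<String> seen = new ArrayList<>();
    @Override public void visitAst1(Ast1 node) { seen.add("Ast1"); }
    @Override public void visitAst2(Ast2 node) { seen.add("Ast2"); }
}

public class UniqueNamesDemo {
    public static void main(String[] args) {
        NameCollector collector = new NameCollector();
        List<AstNode> nodes = Arrays.asList(new Ast1(), new Ast2());
        for (AstNode node : nodes) {
            node.accept(collector); // runtime type of the node decides which method runs
        }
        System.out.println(collector.seen); // prints [Ast1, Ast2]
    }
}
```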

We figured out how to solve the problem for us: we had a bug in the generator, because each overriding method had the same body as the inherited method it overrode.

This leaves us two ways to solve it:

  • (a) no longer generate the methods that are inherited
  • (b) generate all methods, but remove the interface inheritance

The interesting thing is that (a) needs more compile time than (b).

I ran an experiment on my Mac to reproduce the results we found during our fixing process; you can download it at: https://drive.google.com/open?id=0B6L6K365bELNWDRoeTF4RXJsaFk

I only describe the basic files of the experiment and the results here. Maybe somebody finds it useful.

Version 1 is (b) and looks like:

DelegatorVisitorA.java

interface DelegatorVisitorA extends VisitorA {
  VisitorA getVisitorA();  

  default void visit(AST_A1 node) {
    getVisitorA().visit(node);
  }
  ...
  default void visit(AST_A49 node) {
    getVisitorA().visit(node);
  }
}

DelegatorVisitorB.java

interface DelegatorVisitorB extends VisitorB {
  VisitorA getVisitorA();  
  default void visit(AST_A1 node) {
    getVisitorA().visit(node);
  }
  ...
  default void visit(AST_A49 node) {
    getVisitorA().visit(node);
  }
  VisitorB getVisitorB();  

  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

DelegatorVisitorC.java

interface DelegatorVisitorC extends VisitorC {
  VisitorA getVisitorA();
  default void visit(AST_A1 node) {
    getVisitorA().visit(node);
  }
  ...
  default void visit(AST_A49 node) {
    getVisitorA().visit(node);
  }
  VisitorB getVisitorB();  
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
  VisitorC getVisitorC();  
  default void visit(AST_C1 node) {
    getVisitorC().visit(node);
  }
  ...
  default void visit(AST_C49 node) {
    getVisitorC().visit(node);
  }
}

Version 2 is (a) and looks like:

DelegatorVisitorA.java same as in Version 1

DelegatorVisitorB.java

interface DelegatorVisitorB extends VisitorB, DelegatorVisitorA {
  VisitorB getVisitorB();
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

DelegatorVisitorC.java

interface DelegatorVisitorC extends VisitorC, DelegatorVisitorB {
  VisitorB getVisitorB();
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

Version 3 (an intermediate step we had, but it is also wrong) looks like:

DelegatorVisitorA.java same as in Version 1

DelegatorVisitorB.java

interface DelegatorVisitorB extends VisitorB, DelegatorVisitorA {
  VisitorB getVisitorB();
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

DelegatorVisitorC.java

interface DelegatorVisitorC extends VisitorC, DelegatorVisitorA, DelegatorVisitorB {
  VisitorB getVisitorB();
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

Version 4 (the old version which caused this post) looks like:

DelegatorVisitorA.java same as in Version 1

DelegatorVisitorB.java

interface DelegatorVisitorB extends VisitorB, DelegatorVisitorA {
  VisitorA getVisitorA();  
  default void visit(AST_A1 node) {
    getVisitorA().visit(node);
  }
  ...
  default void visit(AST_A49 node) {
    getVisitorA().visit(node);
  }
  VisitorB getVisitorB();  

  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
}

DelegatorVisitorC.java

interface DelegatorVisitorC extends VisitorB, DelegatorVisitorA, DelegatorVisitorB {
  VisitorA getVisitorA();
  default void visit(AST_A1 node) {
    getVisitorA().visit(node);
  }
  ...
  default void visit(AST_A49 node) {
    getVisitorA().visit(node);
  }
  VisitorB getVisitorB();  
  default void visit(AST_B1 node) {
    getVisitorB().visit(node);
  }
  ...
  default void visit(AST_B49 node) {
    getVisitorB().visit(node);
  }
  VisitorC getVisitorC();  
  default void visit(AST_C1 node) {
    getVisitorC().visit(node);
  }
  ...
  default void visit(AST_C49 node) {
    getVisitorC().visit(node);
  }
}

Here I only showed DelegatorVisitorA.java, DelegatorVisitorB.java and DelegatorVisitorC.java in the different versions. The other delegator visitors DelegatorVisitorD.java to DelegatorVisitorI.java follow the same pattern. (DelegatorVisitorI belongs to the language I which extends the language H. Language H has DelegatorVisitorH and language H extends language G, and so on.)

Compiling DelegatorVisitorI.java, generated in each of the four versions described above, takes the following times:

Version 1:
103-240:srcV1 michael$ time javac DelegatorVisitorI.java

real    0m1.859s
user    0m5.023s
sys     0m0.175s

Version 2:
103-240:srcV2 michael$ time javac DelegatorVisitorI.java

real    0m3.364s
user    0m7.713s
sys     0m0.342s

Version 3:
103-240:srcV3 michael$ time javac DelegatorVisitorI.java

real    2m58.009s
user    2m56.787s
sys     0m1.718s

Version 4:
103-240:srcV4 michael$ time javac DelegatorVisitorI.java

real    14m14.923s
user    14m3.738s
sys     0m5.141s

The Java files of all four versions have the same behavior, but because of the duplicated code the compilation takes much longer.

Also interesting: if you copy the methods and do not use any inheritance, then compilation is fastest, even though the files become much bigger after a very long inheritance chain.

(I personally cannot explain the large time difference between version 2 and version 3; maybe it is a bug in the analysis phase of the javac compiler.)
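
To repeat this kind of measurement at other sizes without the download, a small source generator is enough. The following is only a sketch under stated assumptions: VisitorSourceGenerator and its generate method are hypothetical and are not the generator used for the experiment. It emits n empty AST classes and a single interface with n overloaded default visit methods, which covers the same-name aspect but not the inheritance chains of versions 2 to 4.

```java
// Hypothetical generator, not the one used in the experiment above: emits one
// source text with n empty AST classes and one interface with n overloaded defaults.
public class VisitorSourceGenerator {

    static String generate(String name, int n) {
        StringBuilder src = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            src.append("class AST_").append(name).append(i).append(" {}\n");
        }
        src.append("interface Visitor").append(name).append(" {\n");
        for (int i = 1; i <= n; i++) {
            src.append("  default void visit(AST_").append(name).append(i)
               .append(" node) {}\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // Number of methods to generate; defaults to 3 when no argument is given.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        System.out.print(generate("A", n));
    }
}
```

Redirecting the output to a file such as VisitorA.java and running time javac VisitorA.java for increasing n gives a rough picture of how compile time grows with the number of same-named methods.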

Asben answered 9/3, 2017 at 19:47 Comment(0)

The credit for this answer goes to @Brian Goetz.

I created a dummy test with two variants: in one, all the visit methods were overridden and overloaded; in the other, the visitX methods had distinct names.

The outcome was more surprising than I expected: when overloading and overriding the visit methods, the compiler needed nearly 30 minutes! When I gave the visit methods unique names within each visitor class, the compiler needed only 46 seconds.

Here is the source code for the dummy test: https://drive.google.com/open?id=0B6L6K365bELNUkVYMHZnZ0dGREk

And here are the screenshots of the compile times on my computer:

[Screenshot: VisitorN contains overloaded and overridden visit methods]

[Screenshot: VisitorG contains the optimized visitX methods, which are only overridden but no longer overloaded]

Using the "plain" approach with distinct visitX methods, compiling Visitor_S and VisitorPlain_S needs only about 22 seconds (twice as fast as the approach that directly overrides the default visitX methods).

[Screenshot: Visitor_S has default methods, but it extends VisitorPlain_S, which has no default methods. VisitorPlain_S extends other "plain" visitors without default methods.]

What I still do not understand, just out of theoretical interest, is the point about bridge methods: according to https://docs.oracle.com/javase/tutorial/java/generics/bridgeMethods.html, bridge methods only occur due to type erasure, but in our example we had no generics, so type erasure should not play a role at all. Maybe somebody has a good explanation of why it still matters.
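
As a side note to the bridge-method question, the following sketch uses reflection to show where javac emits bridge methods. The class names (BridgeDemo, ByLength, Base, Derived) are hypothetical. Besides the generic case from the tutorial, it includes an override with a covariant return type, which also produces a bridge without any generics. It does not show whether this is the reason for the check in the visitor case, only that bridges are not limited to generic code.

```java
import java.lang.reflect.Method;
import java.util.Comparator;

public class BridgeDemo {
    // Generic case: after erasure, Comparator.compare takes (Object, Object),
    // so javac adds a bridge that forwards to compare(String, String).
    static class ByLength implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    }

    // Non-generic case: overriding with a covariant return type.
    static class Base {
        Object make() { return new Object(); }
    }

    static class Derived extends Base {
        @Override
        String make() { return "derived"; }
    }

    // Counts the compiler-generated bridge methods declared directly in a class.
    static int countBridges(Class<?> type) {
        int bridges = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                bridges++;
            }
        }
        return bridges;
    }

    public static void main(String[] args) {
        System.out.println("ByLength bridges: " + countBridges(ByLength.class));
        System.out.println("Derived bridges:  " + countBridges(Derived.class));
    }
}
```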

Asben answered 25/8, 2016 at 22:50 Comment(0)

After an extra meeting just for this problem, we figured out the following limitations of the first answer:

The first answer works very well for "static" visitors, as they are used in ANTLR, because there you do not have language interfaces, so the visit method knows exactly the AST types of the children. In MontiCore we can define an interface grammar element, which is explained here:

grammar MontiArc {
  MontiArc = "component" Name "{" ArcElement* "}";
  interface ArcElement;
  Port implements ArcElement = "port" ... ;
}

grammar MontiArcAutomaton extends MontiArc {
  Automaton implements ArcElement = State | Transition;
  State = "state" ... ;
  Transition = ... "->" ...;
}

The visitor for MontiArcAST does not know exactly which accept method should be invoked, since you do not know whether to call PortAST#accept or even the not-yet-known method State#accept, which will be introduced later through grammar extension. That is why we use double dispatch, but for that the visit methods must have the same name (since we cannot know the method visitState(StateAST node), which does not exist when we generate the visitor for the MontiArc grammar).

We thought about generating visitX methods and delegating to them from the general visit method using a large instanceof-if cascade. But this would require adding additional if statements to visit(MontiArcAST node) after deploying the jar file of the MontiArc grammar, and this would destroy our modularity.
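
The instanceof cascade described above could look roughly like this. It is only a sketch with hypothetical types (ArcElement, Port, State, MontiArcVisitor) modeled on the grammar example, with String results so the effect can be printed; the real generated classes are not shown in this thread.

```java
// Hypothetical types; ArcElement stands for the interface nonterminal of the grammar.
interface ArcElement {}

class Port implements ArcElement {}

// Introduced later by the extending grammar; unknown when MontiArcVisitor is generated.
class State implements ArcElement {}

class MontiArcVisitor {
    String visitPort(Port node) { return "port"; }

    // General entry point: every concrete node type needs its own branch here.
    String visit(ArcElement node) {
        if (node instanceof Port) {
            return visitPort((Port) node);
        }
        // A State ends up here unless this already deployed class is edited.
        return "unhandled " + node.getClass().getSimpleName();
    }
}

public class CascadeDemo {
    public static void main(String[] args) {
        MontiArcVisitor visitor = new MontiArcVisitor();
        System.out.println(visitor.visit(new Port()));  // prints "port"
        System.out.println(visitor.visit(new State())); // prints "unhandled State"
    }
}
```

The fallback branch marks the spot where a new if statement would have to be added for every node type that a later grammar introduces.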

We will try to analyze the problem further, and I will keep you up to date if we find a new methodology for generating large dynamic visitors.

Asben answered 26/8, 2016 at 11:0 Comment(0)
