Is it possible to compile a large Java module in parallel?

I know that multiple modules can be compiled using multiple threads where each thread compiles a single module but what if I have a single large module? Does Javac or the Eclipse Java Compiler support compiling single modules in parallel (using many threads)? Or is there any other Java compiler which supports it?

Update: I created a Java source file with ~50k simple methods (just for the purpose of this test) such as:

    static int add1(int a, int b, int c) {
        return 2 * a + 55 * b - c;
    }

    static int add2(int a, int b, int c) {
        return 2 * a + 55 * b - c;
    }

    static int add3(int a, int b, int c) {
        return 2 * a + 55 * b - c;
    }

These methods do not depend on each other, so compilation could be done in parallel (at least in theory). Compiling this file with Javac on my 12-core + HT machine led to an average of 20% CPU usage, with a really short spike of up to 50%. This leads me to believe that although there is some parallelization inside Javac, it is really minor.
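For anyone who wants to reproduce the test, a file like the one above could be generated with a small program. This is only a sketch: the class name `BigClassGenerator`, the generated class name `Big`, and the output file `Big.java` are made up for illustration and are not the names used in my original test.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; all names here are illustrative assumptions.
class BigClassGenerator {

    /** Builds the source of a class containing methodCount independent static methods. */
    static String generate(String className, int methodCount) {
        StringBuilder sb = new StringBuilder();
        sb.append("public class ").append(className).append(" {\n");
        for (int i = 1; i <= methodCount; i++) {
            // Every method has the same body and calls nothing else,
            // so no method depends on any other.
            sb.append("    static int add").append(i)
              .append("(int a, int b, int c) {\n")
              .append("        return 2 * a + 55 * b - c;\n")
              .append("    }\n\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Writes Big.java with ~50k methods into the current directory.
        String source = generate("Big", 50_000);
        Files.write(Paths.get("Big.java"), source.getBytes(StandardCharsets.UTF_8));
    }
}
```

Generating 2, 3 or 4 such files with different class names gives the inputs for the multi-file comparison.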

Interestingly, if I create 2, 3 or 4 classes with the same number of methods and compile them together in a single Javac process, CPU usage does not go any higher. The compilation takes exactly 2x, 3x or 4x as long, which shows that Javac does not compile these totally unrelated classes in parallel. But if I start a separate Javac process for each file, CPU usage jumps to almost 100% with 4 files (= 4 Javac processes), and the compilation takes just 5-10% longer than compiling a single file. By comparison, a single Javac process compiling all 4 files takes about 4x as long.
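The "one Javac process per file" setup can be sketched roughly as follows. This is an illustration, not the exact script used for the measurements: the class name `ParallelJavacProcesses` and the `out` directory are invented, and the lookup of `javac` under `java.home` assumes a JDK 9+ directory layout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical launcher; names and paths are assumptions for illustration.
class ParallelJavacProcesses {

    /** Command line for one javac process that compiles a single source file. */
    static List<String> javacCommand(Path javac, Path outDir, Path source) {
        return Arrays.asList(javac.toString(), "-d", outDir.toString(), source.toString());
    }

    /** Starts one javac process per file, waits for all of them, returns true if all exit with 0. */
    static boolean compileAll(Path javac, Path outDir, List<Path> sources)
            throws IOException, InterruptedException {
        Files.createDirectories(outDir);
        List<Process> running = new ArrayList<>();
        for (Path source : sources) {
            // start() returns immediately, so all processes run concurrently.
            running.add(new ProcessBuilder(javacCommand(javac, outDir, source))
                    .inheritIO()
                    .start());
        }
        boolean ok = true;
        for (Process p : running) {
            ok &= (p.waitFor() == 0);
        }
        return ok;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: JDK 9+ layout, where javac lives in <java.home>/bin.
        Path javac = Paths.get(System.getProperty("java.home"), "bin", "javac");
        List<Path> sources = new ArrayList<>();
        for (String arg : args) {
            sources.add(Paths.get(arg));
        }
        System.exit(compileAll(javac, Paths.get("out"), sources) ? 0 : 1);
    }
}
```

This only works when the files do not reference each other, since each process sees just its own source file.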

So my impression is that Javac does use multiple threads when compiling, but it seems limited to roughly 4 threads and cannot fully utilize a 12-core machine. It also seems to me that Javac compiles multiple files serially and only uses extra cores/threads within a single file. (I believe some parts of compiling a single file can be done in parallel, and that this is what Javac does. But what about compiling multiple files in parallel? If I have 100 independent files, I should see my CPU jump to 100%, which is not the case.)

Isochromatic answered 27/12, 2020 at 14:5 Comment(10)
Maven supports parallel builds: cwiki.apache.org/confluence/display/MAVEN/… - Consubstantiate
@Consubstantiate As far as I know, it only works if the project has multiple modules. - Isochromatic
There are only two compilers in wide use that track the latest versions of Java, which would be important to me: the OpenJDK one and the Eclipse one. I would take a very close look at the Eclipse one to see what it can do. - Zygotene
You have a class with 50000 methods? - Urinary
@Urinary I don't. I just created it for the purpose of this test. - Isochromatic
You might want to check out the following post, because it has details that could be interesting: "why parallel execution on java compile take linear growth in time" - Urinary
@Urinary Thanks, will take a look. - Isochromatic
How would the compiler know that two Java files are independent of each other without first compiling them? (Remember, if class A depends on class B, and both classes are currently in source form, while compiling A the compiler will see that there is a B.java file without a B.class file, and compile B.java.) - Autotrophic
@Autotrophic This is a fair point, but I still think that Javac could do better. For example, it could compile files in multiple passes and determine during a very fast first pass whether a file is independent. - Isochromatic
Now think about how many users are really affected by this, and put it in perspective. 20 years in, HashSet still uses a HashMap under the hood (I could give more examples like this). The point is that such optimizations may never be a priority for the JDK team, unless a VERY brave and smart enthusiast wants to help with a PR (which will never happen). - Intramundane

Yes, it is possible to build Java code in parallel.

The Java compiler (javac) itself doesn't do this, but both Maven and Ant (and some versions of Make) can run multiple javac instances in parallel.
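The same idea can also be tried inside a single JVM with the standard `javax.tools` API, by giving each source file its own compilation task on a thread pool. Treat this as a sketch under assumptions rather than a recommended setup: the class name `ParallelCompileTasks` is invented, the files are assumed to be independent of each other, and each task gets its own file manager because file managers are not required to support multi-threaded access.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.tools.JavaCompiler;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical helper; the class and method names are illustrative assumptions.
class ParallelCompileTasks {

    /** Compiles each source file as its own task on a thread pool; returns true if all succeed. */
    static boolean compileAll(List<File> sources, File outDir, int threads) throws Exception {
        // getSystemJavaCompiler() returns null when no compiler is available (e.g. a plain JRE).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }
        outDir.mkdirs();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (File source : sources) {
                results.add(pool.submit(() -> {
                    // One file manager and one task per file, so tasks share no file manager state.
                    try (StandardJavaFileManager fm =
                                 compiler.getStandardFileManager(null, null, null)) {
                        return compiler.getTask(null, fm, null,
                                Arrays.asList("-d", outDir.getPath()), null,
                                fm.getJavaFileObjects(source)).call();
                    }
                }));
            }
            boolean ok = true;
            for (Future<Boolean> result : results) {
                ok &= result.get();
            }
            return ok;
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task is still compiled single-threaded; any speedup comes only from running several independent tasks at once.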

Furthermore, the Eclipse Java compiler is multi-threaded and you can tell Maven to use it instead of javac; see https://mcmap.net/q/485219/-using-multiple-cores-processors-when-compiling-java


I note that your example involves compiling a single class with a huge number of methods. Parallel compiler instances won't help with that. The Eclipse compiler might help depending on how it is implemented.

However, I put it to you that that is an unrealistic example. People don't write code like that in real life [1], and code generators can (and should) be written to not emit source code like that.

[1] Their co-workers will rebel ...

Iamb answered 3/1, 2021 at 15:32 Comment(0)

javac always runs single-threaded. There is a request to improve javac performance, JDK-4229449 ("RFE: Please multithread javac for better performance"); however, Oracle does not intend to change the compilation architecture.

Stefanysteffane answered 29/12, 2020 at 23:8 Comment(4)
You're right. This was submitted in 1999!! and it was marked as Closed, Resolution: Won't Fix. What a pity! - Isochromatic
We think alike. I also think multithreaded compilation could be useful. - Stefanysteffane
"This was submitted in 1999!! and it was marked as Closed." Yes, but only because the issue was too specific about how performance should be improved. From the comments: "Yes, the compiler should be faster. [...] But this RFE is a bit too specific in suggesting how we do that." I think Java compiler developers do consider compilation performance in general. - Orman
There is a newer issue with status closed/delivered: JEP 139. - Orman
