Is there a way to speed up Javadoc (takes 7 minutes)
S

6

46

I am building the Javadoc for a module with 2,509 classes. This currently takes 7 minutes, or about 6 files per second.

I have tried

mvn -T 1C install

However, javadoc only uses one CPU. Is there a way to use more cores, or otherwise speed it up?

I am using Oracle JDK 8 update 112. My dev machine has 16 cores and 128 GB of memory.

Running Flight Recorder, I can see that only one thread, main, is doing the work.


For those who are interested, I've used the following options:

<plugin>
    <artifactId>maven-javadoc-plugin</artifactId>
    <configuration>
        <additionalJOptions>
            <additionalJOption>-J-XX:+UnlockCommercialFeatures</additionalJOption>
            <additionalJOption>-J-XX:+FlightRecorder</additionalJOption>
            <additionalJOption>-J-XX:StartFlightRecording=name=test,filename=/tmp/myrecording-50.jfr,dumponexit=true</additionalJOption>
            <additionalJOption>-J-XX:FlightRecorderOptions=loglevel=debug</additionalJOption>
        </additionalJOptions>
    </configuration>
</plugin>

NOTE: One workaround is to do:

-Dmaven.javadoc.skip=true
Shannanshannen answered 16/12, 2016 at 17:12 Comment(24)
Profile the javadoc process. I would assume it's probably IO bound. So you could load the source onto a ramdisk or ssd.Daveta
@ElliottFrisch A good thought, the disk is 3% busy, but the CPU is almost exactly 100% (one cpu). I can profile it with Flight Recorder though, will update.Shannanshannen
CPU might be in IO wait and 100%.Daveta
On this machine 3.7% us, 0.2 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st.Shannanshannen
Perhaps doxygen is multithreaded and compatible with javadoc syntax ?Halliehallman
What jdk are you running? Have you measured the time for running javadoc directly?Leasehold
@Leasehold I am using OracleJDK 8 update 112. I am running the javadoc from maven but don't expect it to be faster without it. I have added a screen shot of the flight recorder report.Shannanshannen
Can you get the actual javadoc invocation? I just tried it on the 2k classes in the java package. Took 35 seconds so something seems off about your times.Leasehold
The Oracle javac compiler is not multithreaded, but the Eclipse compiler is. Can the Eclipse compiler perhaps generate javadoc too?Haemostatic
Does mvn -T 16 install behave differently?Daveta
@ElliottFrisch tried it and the difference was a second (possibly random variation)Shannanshannen
I'm thinking the -T controls how maven starts javac compiler processes, javadoc is a standalone tool. There are very few options documented, for example -verbose will tell you how long it's spending on each file.Daveta
You might be triggering a JavaDoc bug. Most of the time is spent in HashMap.put() and ClassMember.isEqual() which could indicate a poor hash code algorithm which leads to too many conflicts.Officiary
It's not necessarily related, but you invoke javadoc through mvn, so a maven speed-up may be worth a shot, i.e., export MAVEN_OPTS="-client -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xverify:none" (cf. this blog). I don't have too much hope about that, but who knows?Abloom
What version of maven are you using? Did you set any MAVEN_OPTS? What version of maven-javadoc-plugin are you using?Relativistic
@Relativistic I haven't set any OPTS, the process is long running (minutes) so I am not sure this will help. I am using version 2.10.3 of the plugin.Shannanshannen
Maven runs on the JVM, so setting MAVEN_OPTS (e.g. -Xms256m -Xmx512m) can help. Which version of Maven are you using?Relativistic
See this link if it helps : issues.apache.org/jira/browse/LUCENE-5282Gibbet
interesting. is it safe to assume that you are invoking mvn javadoc:javadoc?Foretaste
@Foretaste correct.Shannanshannen
@PeterLawrey that is very interesting, I've tried running that command for sources of JMH - 3618 of classes, around 12 seconds. I'm running 3.0.1 version of the plugin.Foretaste
@Foretaste I will try updating the plugin. I suspect the problem is the number of relationships between classes.Shannanshannen
@PeterLawrey for the record, I've also tried around 5 other projects I have from openjdk and our 10 of internals ones - some modules, all much above 2k classes... it's most probably the data itself in your project that triggers a weird path. plz post back with results once you doForetaste
@PeterLawrey is it possible to generate javadocs individually for each submodule/subpackage and then assemble these parts?Ruiz
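The comment above about time being spent in HashMap.put() and ClassMember.isEqual() can be illustrated with a small, self-contained sketch. It is not javadoc code: the Key class and the constantHash switch are made up for the demonstration. It only shows that when many distinct keys share one hash code, HashMap has to fall back on equals() far more often during inserts.

```java
import java.util.HashMap;
import java.util.Map;

class HashCollisionDemo {
    /** Counts every equals() call made on any Key. */
    static long equalsCalls = 0;

    /** Hypothetical key type; constantHash = true simulates a poor hashCode(). */
    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            // A constant hash puts every key in the same bucket.
            return constantHash ? 42 : Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    /** Inserts n distinct keys and returns how many equals() calls that took. */
    public static long equalsCallsForInserts(int n, boolean constantHash) {
        equalsCalls = 0;
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, constantHash), i);
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        int n = 2_000;
        System.out.println("well-spread hashes: " + equalsCallsForInserts(n, false) + " equals() calls");
        System.out.println("constant hash:      " + equalsCallsForInserts(n, true) + " equals() calls");
    }
}
```

If a profile of the real javadoc run shows the same pattern, the number of classes matters less than how their members hash, which would fit the observation that other large projects build quickly.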
P
6

Running Maven with -T1C makes it try to build modules in parallel, so in a multi-module project it will at best generate each module's javadoc in parallel (if the dependency graph between your modules allows it).

The javadoc process itself is single-threaded, so you won't be able to use multiple cores to generate the javadoc of one single module.

However, since you have many classes (and possibly many {@link} tags or similar?), the javadoc process might benefit from a larger heap. Have you looked into GC activity? Try adding this to your configuration and see if it helps:

<additionalJOption>-J-Xms2g</additionalJOption>
<additionalJOption>-J-Xmx2g</additionalJOption>
Phaih answered 2/2, 2017 at 9:14 Comment(2)
I could check the memory size is not limited. The default should be 32 GB on this machine.Shannanshannen
@PeterLawrey the problem may not be the limits, but the starting size. JVM will extend memory a little only after every full GC, so a reasonable amount of Xms will let JVM avoid too much GC before it extends the memory enough for your workloadDistressful
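To see the difference between the starting size and the limit discussed above, a minimal sketch can print both for whatever JVM runs it. The class name HeapReport is made up for illustration, and the code does not inspect the javadoc process itself; running it with and without the same -Xms/-Xmx values only shows what those flags change.

```java
class HeapReport {
    /** Returns {committed, max} heap sizes in MiB for the JVM running this code. */
    public static long[] heapMiB() {
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        // totalMemory() is what the JVM has claimed so far; maxMemory() is the ceiling it may grow to.
        return new long[] { rt.totalMemory() / mib, rt.maxMemory() / mib };
    }

    public static void main(String[] args) {
        long[] heap = heapMiB();
        System.out.println("committed now: " + heap[0] + " MiB, max: " + heap[1] + " MiB");
    }
}
```

A large gap between the two numbers at startup means the heap has to grow during the run, which is the situation a higher -Xms is meant to avoid.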
W
6

@lbndev is right, at least for the default Doclet (com.sun.tools.doclets.formats.html.HtmlDoclet) that is supplied with Javadoc. A look through the source confirms the single-threaded implementation:

(Those links are to JDK 8 source. With JDK 11 the classes have moved, but the basic for loops in HtmlDoclet and AbstractDoclet are still there.)

Some sample-based profiling confirmed that these methods are the bottleneck (screenshot: Javadoc-profiling).

This won't be what you're hoping to hear, but there appears to be no option for multi-threading in the current standard Javadoc, at least within a single Maven module.

generateClassFiles() etc would lend themselves well to a bit of multithreading, though this would probably need to be a change in the JDK. As mentioned below AbstractDoclet.isValidDoclet() even actively blocks subclassing of HtmlDoclet. Trying to reimplement some of those loops as a third party would need to pull in a lot of other code.
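As a rough idea of what "a bit of multithreading" around the per-class loop could look like, here is a sketch under heavy assumptions. Nothing in it is JDK doclet API: renderPage() is a hypothetical stand-in for generating one class page, and the real HtmlDoclet shares far more state than this toy does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelPageSketch {
    /** Hypothetical stand-in for rendering one class page; not a JDK API. */
    static String renderPage(String className) {
        return "<html><body><h1>" + className + "</h1></body></html>";
    }

    /** Renders one page per class on a fixed pool and collects the results by class name. */
    public static Map<String, String> renderAll(List<String> classNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // Any structure shared between workers has to be thread-safe.
        Map<String, String> pages = new ConcurrentHashMap<>();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (String name : classNames) {
                pending.add(CompletableFuture.runAsync(() -> pages.put(name, renderPage(name)), pool));
            }
            // join() rethrows any rendering failure as an unchecked CompletionException.
            pending.forEach(CompletableFuture::join);
        } finally {
            pool.shutdown();
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> names = List.of("com.example.Foo", "com.example.Bar");
        renderAll(names, 4).forEach((name, html) -> System.out.println(name + " -> " + html.length() + " chars"));
    }
}
```

The easy part is the executor; the hard part in the real tool would be making every shared data structure that renderPage() touches safe for concurrent use.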

A scan around other Doclet implementations (e.g. javadown) only found a similar implementation style around the package and class drilldown. It's possible others on this thread will know more.

Thinking a bit more widely, there might be room for tuning around DocFileFactory. It's clearly marked as an internal class (not even public in the package), but it does abstract the writing of the (HTML) files. It seems possible that an alternative version could buffer the HTML in memory, or stream directly to a zip file, to improve the IO performance. Anyone doing this would have to accept the risk that the JDK tools change underneath them.
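The "stream directly to a zip file" idea can be sketched on its own, independent of DocFileFactory. The ZipPageSink class and its writePages() method are hypothetical names; only java.util.zip is real API here, and wiring anything like this into javadoc would still mean replacing internal classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipPageSink {
    /** Writes each (path -> HTML) pair as one zip entry, closes the target, and returns the entry count. */
    public static int writePages(Map<String, String> pages, OutputStream target) {
        int count = 0;
        try (ZipOutputStream zip = new ZipOutputStream(target, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> page : pages.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("com/example/Foo.html", "<html><body>Foo</body></html>");
        pages.put("com/example/Bar.html", "<html><body>Bar</body></html>");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int written = writePages(pages, out);
        System.out.println(written + " entries, " + out.size() + " bytes");
    }
}
```

One archive instead of thousands of small files mainly saves filesystem overhead, so it would only help if the run were IO bound, which the CPU figures reported in the question suggest it is not.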

Wist answered 6/10, 2018 at 15:59 Comment(3)
Hmmm, a crafty subclass of HtmlDoclet could override methods such as generateClassFiles() and introduce an executor. It's going to need to be JDK specific though. Can I check which JDK is the target now?Wist
Good luck with that, with all the singletons, stateful classes and javac. It's futile.Acquirement
Well ... the idea was to jump in later with a new doclet implementation - as the -doclet parameter. Looks like the implementers of HtmlDoclet saw this possibility and locked it down though. AbstractDoclet.isValidDoclet() checks the fully qualified classname of the subclass is com.sun.tools.doclets.formats.html.HtmlDoclet. It's private, and called from an internal method, so there would be lots to reimplement. Similar in JDK 11.Wist
C
0

javadoc, and the standard doclet, are currently fundamentally single-threaded.

It is "on the radar" to improve this, primarily by generating pages in parallel, but this means retrofitting MT-safeness to various shared data structures.

Cheap answered 8/10, 2018 at 18:47 Comment(0)
M
0

You can have Maven use multiple threads per core on all cores.

For example:

mvn -T 4C install # will use 4 threads per available CPU core

You can change the 4 above to whatever number you want. Your machine has plenty of resources, so try 8 or 16.

Also, have you tried the javadoc-no-fork goal? It ensures javadoc is not triggered a second time: https://maven.apache.org/plugins/maven-javadoc-plugin/examples/javadoc-nofork.html

Mcshane answered 10/10, 2018 at 14:49 Comment(4)
This doesn't change the way the javadoc plugin works.Shannanshannen
Have added info on javadoc-no-forkMcshane
Sounds worth a try.Shannanshannen
@PeterLawrey did you try?Mcshane
A
0

Maven customization is a way to speed up javadoc generation.

Another approach would be to change the doclet used for generating the javadoc. The Maven Javadoc plugin allows you to do this:

https://maven.apache.org/plugins/maven-javadoc-plugin/examples/alternate-doclet.html

I found the following commercial doclet (I'm not affiliated with them in any way), which claims to be faster than the standard javadoc. It offers free, trial and commercial licenses. If you're really eager to speed up your javadoc build, it may be worth checking whether it justifies the price:

http://www.filigris.com/docflex-javadoc

There may also be open-source alternatives out there.

Autoicous answered 11/10, 2018 at 14:25 Comment(0)
C
-3

Use doxygen instead of the regular mvn setup you are using now.

Comintern answered 12/1, 2017 at 23:47 Comment(1)
Does that speed up javadoc?Shannanshannen
