EDIT: I just wanted to clarify that, in the interest of asking a question that has a known, objective answer, this question is, What does "Killing self fork JVM. PING timeout elapsed" actually mean, e.g. what is pinging what, and why failsafe decide it should exit the test process? Since this is StackOverflow, please don't respond with advice that fixes some VM exits, especially those that result in different behavior from what we see below. E.g. there is no OutOfMemoryError in the console, so I don't think the VM is running out of heap space. If you do answer that way, an SO admin may misunderstand my question as a result and lock it or close it.
We sometimes have a VM crash in a CI build, e.g.:
[INFO] Results:
[INFO]
[WARNING] Tests run: 8152, Failures: 0, Errors: 0, Skipped: 31
...
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:2.22.1:verify (integration-test) on project app_server: There are test failures.
[ERROR]
[ERROR] Please refer to /builds/App/Development/App/app_server/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /builds/App/Development/App/app_server && /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx3g -jar /builds/App/Development/App/app_server/target/surefire/surefirebooter7662621916357034130.jar /builds/App/Development/App/app_server/target/surefire 2019-01-09T21-23-07_397-jvmRun1 surefire1770987927673067492tmp surefire_37459604808221437221tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR] Crashed tests:
[ERROR] com.company.blah.blah.ITSomeIntegrationTests
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:282)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:245)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:954)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:192)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
(Of course, "There are test failures" is wrong, as there were no failed tests.)
What I want to know first is, what is failsafe trying to tell us? Here's some information we gather:
First, there is no stack dump or heap dump, but there are .dump files left behind by surefire and failsafe. For the failed project, there's always a .dump file like:
# Created at 2019-02-12T14:31:16.410
System.exit() or native command error interrupted process checker.
java.lang.IllegalStateException: Cannot use PPID 158 process information. Going to use NOOP events.
at org.apache.maven.surefire.booter.PpidChecker.checkProcessInfo(PpidChecker.java:155)
at org.apache.maven.surefire.booter.PpidChecker.isProcessAlive(PpidChecker.java:124)
at org.apache.maven.surefire.booter.ForkedBooter$2.run(ForkedBooter.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
# Created at 2019-02-12T14:47:59.174
Killing self fork JVM. PING timeout elapsed.
(Is this a breakdown in communication between the failsafe and Maven processes?)
Also, we print the stack trace when System.exit() is called, and in every such failure it looks like:
java.lang.Exception: System.exit() or similar method called:
at com.app.IntegrationTestSetup$1.checkPermission(IntegrationTestSetup.java:78)
at java.lang.SecurityManager.checkExit(SecurityManager.java:761)
at java.lang.Runtime.halt(Runtime.java:273)
at org.apache.maven.surefire.booter.ForkedBooter.kill(ForkedBooter.java:311)
at org.apache.maven.surefire.booter.ForkedBooter.kill(ForkedBooter.java:305)
at org.apache.maven.surefire.booter.ForkedBooter.access$300(ForkedBooter.java:68)
at org.apache.maven.surefire.booter.ForkedBooter$5.run(ForkedBooter.java:285)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
i.e. some surefire code (used by the failsafe plugin, in this case) killed the failsafe process (the JVM that surefire / failsafe starts for tests in a process that is a child of the Maven process).
We're using version 2.22.1 of both failsafe and surefire.
# mvn -v
Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z)
Maven home: /usr/share/java/maven-3
Java version: 1.8.0_191, vendor: Oracle Corporation, runtime: /usr/lib/jvm/java-1.8-openjdk/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.15.0-45-generic", arch: "amd64", family: "unix"