Unnecessary Full GC with the G1 garbage collector in Java 8?

We see occasional full GCs with the G1 garbage collector, preceded by concurrent-mark overflow. Once a concurrent-mark-reset-for-overflow occurs, the overflow recurs in every subsequent concurrent mark phase. Eventually this leads to a full GC, since concurrent marking no longer seems to work.

We have four machines running the same Apache Storm based application with the same data traffic. Only one of the machines experiences this, about once a week.

Is this related to the bug ‘G1 does not expand marking stack when mark stack overflow happens during concurrent marking’ (https://bugs.openjdk.java.net/browse/JDK-8065402)?

Following the suggestion on that page, we doubled the concurrent mark threads from 4 to 8 and our heap size from 8GB to 16GB. However, the full GC still happens; it just takes longer to occur.

Any other suggestions?

Here's the GC log:

Java HotSpot(TM) 64-Bit Server VM (25.65-b01) for linux-amd64 JRE(1.8.0_65b17), 
built on Oct  6 2015 17:16:12 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) 
Memory: 4k page, physical 529167668k(69283408k free), swap 33554424k(33552380k free) 
CommandLine flags: -XX:ConcGCThreads=8 -XX:G1ReservePercent=20 -XX:GCLogFileSize=104857600 
-XX:InitialHeapSize=17179869184 -XX:InitiatingHeapOccupancyPercent=45 -XX:MaxGCPauseMillis=100 
-XX:MaxHeapSize=17179869184 -XX:NumberOfGCLogFiles=10 -XX:ParallelGCThreads=30 
-XX:+PrintAdaptiveSizePolicy -XX:PrintFLSStatistics=2 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime 
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC 
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation
...
...
2016-04-13T22:06:37.254-0400: 19839.175: [GC concurrent-root-region-scan-start]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-root-region-scan-end, 0.0592966 secs]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-mark-start]
2016-04-13T22:06:38.569-0400: 19840.490: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:06:42.810-0400: 19844.731: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:11:19.253-0400: 20121.175: [GC concurrent-mark-reset-for-overflow]
...
...
...
2016-04-14T01:58:17.254-0400: 33739.176: [GC concurrent-mark-reset-for-overflow]
...
2016-04-14T01:58:36.957-0400: 33758.878: [Full GC (Allocation Failure)
Gracegraceful answered 15/4, 2016 at 14:29 Comment(1)
Check this article: blogs.oracle.com/poonam/entry/understanding_g1_gc_logs. On the log line "3.198: [GC concurrent-mark-reset-for-overflow]" it says: This indicates that the global marking stack had became full and there was an overflow of the stack. Concurrent marking detected this overflow and had to reset the data structures to start the marking again. - Eighty

From the Oracle blog post on understanding G1 GC logs:

GC concurrent-mark-reset-for-overflow : This indicates that the global marking stack had became full and there was an overflow of the stack. Concurrent marking detected this overflow and had to reset the data structures to start the marking again

So increasing -XX:MarkStackSize is one quick win.
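Before and after changing the flag, it can help to confirm what value the running JVM actually uses. The following is a minimal sketch, assuming a HotSpot JVM that exposes `com.sun.management.HotSpotDiagnosticMXBean`; the class name `MarkStackCheck` and the helper `vmOption` are made up for illustration, and `MarkStackSizeMax` is queried only on the assumption that this JVM build has such a flag (otherwise it is reported as not available).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

public class MarkStackCheck {
    /** Returns the effective value of a HotSpot flag, or empty if it cannot be read. */
    static Optional<String> vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (RuntimeException e) {
            // Unknown flag name, or the JVM is not HotSpot-based.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // MarkStackSizeMax is an assumption; it may not exist on every build.
        String[] flags = {"MarkStackSize", "MarkStackSizeMax", "ConcGCThreads"};
        for (String flag : flags) {
            System.out.println(flag + " = " + vmOption(flag).orElse("(not available)"));
        }
    }
}
```

Running this inside the affected process (or with the same command-line flags) shows whether a setting such as -XX:MarkStackSize=16M was really picked up.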

A few observations on your VM parameters:

  1. The G1 GC is an adaptive garbage collector with defaults that enable it to work efficiently without modification. Have a quick look at the Oracle documentation page on G1 GC.
  2. Key parameters to set: -XX:MaxGCPauseMillis, -XX:G1HeapRegionSize, -XX:ParallelGCThreads=n, -XX:ConcGCThreads=n. Leave everything else at its default value.
  3. G1 aims for roughly 2048 regions, so with a 16 GB heap the ideal region size is 8 MB (16 GB / 2048).
  4. Revisit your pause time goal, -XX:MaxGCPauseMillis. You have set it to 100 ms, below the 200 ms default; if that is unrealistic for a 16 GB heap, relax it.
  5. The official documentation page recommends how to set -XX:ParallelGCThreads=n and -XX:ConcGCThreads=n depending on the number of cores in your machine.

    -XX:ParallelGCThreads=n: Sets the value of the STW worker threads. Sets the value of n to the number of logical processors. The value of n is the same as the number of logical processors up to a value of 8.

    -XX:ConcGCThreads=n: Sets the number of parallel marking threads. Sets n to approximately 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).

  6. Revisit the -XX:InitialHeapSize=17179869184, -XX:InitiatingHeapOccupancyPercent=45 and -XX:G1ReservePercent=20 parameters. Leave them at their default values unless you have a pressing need to change them.
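The sizing arithmetic in points 3 and 5 can be sketched as follows. This is only an illustration, not HotSpot's actual code: the class `G1Sizing` and its method names are hypothetical, the 1 MB to 32 MB power-of-two clamp and the 5/8 rule above eight processors are assumptions taken from the general G1 tuning guidance rather than from the text above, and the rounding in `concGcThreads` is one plausible reading of "approximately 1/4".

```java
public class G1Sizing {
    static final long MB = 1024L * 1024L;

    /** Region size giving about 2048 regions: power of two, clamped to 1..32 MB (assumed limits). */
    static long regionSizeBytes(long heapBytes) {
        long target = Math.max(heapBytes / 2048, 1);
        long size = Long.highestOneBit(target); // round down to a power of two
        return Math.min(Math.max(size, 1 * MB), 32 * MB);
    }

    /** Equal to the processor count up to 8; beyond that, 5/8 of the extra processors (assumption). */
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    /** Roughly a quarter of the parallel GC threads, never less than one. */
    static int concGcThreads(int parallelThreads) {
        return Math.max(1, (parallelThreads + 2) / 4);
    }

    public static void main(String[] args) {
        long heap = 16L * 1024 * MB; // the 16 GB heap from the question
        System.out.println("region size MB = " + regionSizeBytes(heap) / MB);
        System.out.println("conc threads for 30 parallel threads = " + concGcThreads(30));
    }
}
```

For the configuration in the question (16 GB heap, ParallelGCThreads=30), this gives an 8 MB region size and 8 concurrent threads, which matches the ConcGCThreads=8 already in use.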

Visit this page for better understanding of G1GC logs.
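To judge whether a change such as a larger mark stack helps, it may be useful to count how many overflow resets precede each full GC in the log. The sketch below assumes the JDK 8 log format shown in the question (lines containing "[GC concurrent-mark-reset-for-overflow]" and "[Full GC"); the class `OverflowScan` and its method are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class OverflowScan {
    /** For each "[Full GC" line, reports how many overflow resets were logged since the previous one. */
    static List<Integer> resetsBeforeEachFullGc(List<String> lines) {
        List<Integer> counts = new ArrayList<>();
        int resets = 0;
        for (String line : lines) {
            if (line.contains("[GC concurrent-mark-reset-for-overflow]")) {
                resets++;
            } else if (line.contains("[Full GC")) {
                counts.add(resets);
                resets = 0; // start counting again for the next full GC
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a GC log file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(resetsBeforeEachFullGc(lines));
    }
}
```

Comparing the counts across the four machines, or before and after a flag change, gives a rough picture of whether the overflow is becoming less frequent.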

Eighty answered 15/4, 2016 at 15:37 Comment(1)
A full GC following repeated concurrent-mark-reset-for-overflow events (the same issue as shown in the question) happened again on one of the four machines with the additional setting -XX:MarkStackSize=16M. I will post an update if the issue is resolved after further increasing -XX:MarkStackSize. - Gracegraceful
