Why doesn't G1 start a marking cycle when the InitiatingHeapOccupancyPercent is achieved?
Asked Answered
D

1

14

According to the documentation, XX:InitiatingHeapOccupancyPercent

Sets the Java heap occupancy threshold that triggers a marking cycle. The default occupancy is 45 percent of the entire Java heap.

In my current environment, that does not happen.

My G1 garbage collection configuration is as follows

-Xms25000m
-Xmx25000m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=1000
-XX:GCTimeRatio=99
-XX:InitiatingHeapOccupancyPercent=70
-XX:MaxTenuringThreshold=8
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCCountTarget=16
-XX:G1OldCSetRegionThresholdPercent=3
-XX:G1NewSizePercent=30
-XX:G1RSetUpdatingPauseTimePercent=5

With a 25g heap and an XX:InitiatingHeapOccupancyPercent of 70%, you would expect a marking cycle to begin when 18g is occupied. I'm tailing the garbage collection logs and that doesn't happen.

Here's an extract:

{Heap before GC invocations=592 (full 0):
 garbage-first heap   total 25600000K, used 22802164K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
  region size 8192K, 1526 young (12500992K), 25 survivors (204800K)
 Metaspace       used 37386K, capacity 37948K, committed 38144K, reserved 1083392K
  class space    used 3948K, capacity 4080K, committed 4096K, reserved 1048576K
2016-04-20T22:06:38.272+0000: 4213.406: [GC pause (GCLocker Initiated GC) (young)
Desired survivor size 801112064 bytes, new threshold 8 (max 8)
- age   1:   98537800 bytes,   98537800 total
- age   2:    7053912 bytes,  105591712 total
- age   3:    6556320 bytes,  112148032 total
- age   4:    8836064 bytes,  120984096 total
- age   5:    5725448 bytes,  126709544 total
- age   6:    6702728 bytes,  133412272 total
- age   7:    3831920 bytes,  137244192 total
- age   8:    4166336 bytes,  141410528 total
 4213.406: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 184844, predicted base time: 44.67 ms, remaining time: 955.33 ms, target pause time: 1000.00 ms]
 4213.406: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1501 regions, survivors: 25 regions, predicted young region time: 21.21 ms]
 4213.406: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1501 regions, survivors: 25 regions, old: 0 regions, predicted pause time: 65.88 ms, target pause time: 1000.00 ms]
 4213.475: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 1.40 %, threshold: 1.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 0.0687163 secs]
   [Parallel Time: 61.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 4213406.9, Avg: 4213407.1, Max: 4213407.3, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.2, Max: 6.4, Diff: 0.4, Sum: 173.1]
      [Update RS (ms): Min: 33.5, Avg: 34.0, Max: 34.6, Diff: 1.1, Sum: 951.9]
         [Processed Buffers: Min: 27, Avg: 36.6, Max: 48, Diff: 21, Sum: 1024]
      [Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: 6.3]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Object Copy (ms): Min: 20.1, Avg: 20.6, Max: 20.8, Diff: 0.7, Sum: 577.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.7]
         [Termination Attempts: Min: 1, Avg: 13.2, Max: 19, Diff: 18, Sum: 371]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 4.7]
      [GC Worker Total (ms): Min: 60.9, Avg: 61.2, Max: 61.6, Diff: 0.6, Sum: 1714.2]
      [GC Worker End (ms): Min: 4213468.2, Avg: 4213468.3, Max: 4213468.5, Diff: 0.3]
   [Code Root Fixup: 0.4 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 1.2 ms]
   [Other: 5.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.5 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.8 ms]
      [Humongous Register: 0.2 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 2.4 ms]
   [Eden: 11.7G(11.7G)->0.0B(11.7G) Survivors: 200.0M->200.0M Heap: 21.7G(24.4G)->10.0G(24.4G)]
Heap after GC invocations=593 (full 0):
 garbage-first heap   total 25600000K, used 10516798K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
  region size 8192K, 25 young (204800K), 25 survivors (204800K)
 Metaspace       used 37386K, capacity 37948K, committed 38144K, reserved 1083392K
  class space    used 3948K, capacity 4080K, committed 4096K, reserved 1048576K
}
 [Times: user=1.70 sys=0.01, real=0.07 secs] 
2016-04-20T22:06:38.342+0000: 4213.475: Total time for which application threads were stopped: 0.0701353 seconds, Stopping threads took: 0.0001600 seconds

I'll draw your attention to

{Heap before GC invocations=592 (full 0):
   garbage-first heap   total 25600000K, used 22802164K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
[...]
[Eden: 11.7G(11.7G)->0.0B(11.7G) Survivors: 200.0M->200.0M Heap: 21.7G(24.4G)->10.0G(24.4G)]

More than 70% of the heap is occupied before this collection. Why did that not trigger a marking cycle?

The application continues doing young generation collections, filling up the old regions, eventually leading to an allocation failure and a lengthy Full GC.


Reducing InitiatingHeapOccupancyPercent to 55 had no discernible effect.

At 20, it did start doing mixed collections, but only when ~80% of the heap was occupied.

Dibbell answered 20/4, 2016 at 22:12 Comment(3)
It's exceeding the specified GCTimeRatio, try relaxing that and see if it makes a difference.Cartilaginous
@Cartilaginous So the theory is that it doesn't think it can satisfy through put so it doesn't even try? I'll get back with the results soon.Dibbell
@Cartilaginous It didn't with a GCTimeRatio of 24. Note that this was one of many samples from the log. It's mostly able to keep up with 99. Also, please see my edit.Dibbell
C
16

JDK-6976060 suggests that the need for a marking cycle is calculated at the end of a young GC. Depending on whether it uses the occupancy stats before or after the young GC this may or may not mean that eden space is always considered as 0% occupied for the purpose of IHOP calculation. With a 45% eden size that would mean 70% occupancy could never be reached, the maximum possible occupancy would be 55% and at that point the heap would be completely full and it would be too late for a mixed collection.

But I am doubtful whether this is truly the case because in the face of dynamic young generation sizing this would make the documentation misleading and IHOP tuning far more difficult. It should be fairly easy to verify this with an artificial test-case and manually sized generations.

If that's not the issue then GC pause (GCLocker Initiated GC) (young) might point to bug 8140597, which is fixed in jdk9b94.


Update: The description in Bug 8151176 indeed suggests that for the purpose of the IHO-percentage calculation it calculates oldgen occupancy/overall heap size. Which means young generation occupancy is completely ignored, which in turn means if young gen fraction > IHOP then it can never initiate a concurrent cycle.

The reason is that static IHOP starts if old gen occupancy exceeds a fixed percentage of the current heap capacity. If either the user or ergonomics decide that the old gen can not be larger than that fraction of the heap capacity that triggers concurrent mark, marking will never start.

So currently available solutions are

  • constrain the young gen fraction < IHOP
  • decrease IHOP to take the minimum possible old gen fraction into account
  • let the JVM adjust the IHOP dynamically

Update2: The latest comment on that bug indicates that this has been fixed for some time, so this answer should be considered as historic.

Cartilaginous answered 21/4, 2016 at 16:48 Comment(8)
Concerning your first paragraph. Whether it considers both generations or not seems unrelated since it never starts a marking cycle. It keeps young gc'ing until there's no more space left. As for the bug, I'll try to upgrade to 9 and see what happens.Dibbell
@Pillar, sorry, bad wording. I didn't mean that eden is excluded from overall accounting, but instead always accounted as 0%, because it's empty after a young collection. Which would mean it can't reach 70% occupancy if your eden is already 45% of the heap. But that were true that would be a fairly troublesome design precisely for the issues you're experiencing, so i'm far from certain that that's really the issue.Cartilaginous
What you're saying makes a lot of sense. I will try this out next. I think I've gone up to 35% and it's worked. (Don't want to play in production right now, will do it at work tomorrow.) Thanks for the help.Dibbell
At first glance, you are correct. With N=40%, IHOP=50%, marking cycle started at 90% full occupancy. With N=30%, IHOP=40%, marking cycle started at 70%. Going to try a few more combinations.Dibbell
Yeah, any combination I run, only the heap after young GC is used in the calculation. For example, after GC, total 20480000K, used 8366146K. That's about 40%. Since my IHOP is also at 40%, a concurrent-mark-start follows it. Do you think this merits opening a bug? I'll try to create an MCVE for them.Dibbell
See JDK-8142484 and others linked there. They've been discussing this.Dibbell
Though, as I read it, it seems they talk about old gen occupancy. In my tests, it's total occupancy, ignoring young gen.Dibbell
Well, that would clearly be a documentation error since the docs say (as you quoted above) "The default occupancy is 45 percent of the entire Java heap.". Bug 8151176 clearly contradicts that.Cartilaginous

© 2022 - 2024 — McMap. All rights reserved.