Short-lived java applications: How to tune G1 to kick in later?

I have short-lived applications that usually (but not always) need no GC at all: everything fits in the heap, as running with Epsilon GC without hitting an OOM shows.

Interestingly, G1 still kicks in very early even though there's still plenty of heap free:

[0.868s][info   ][gc,start     ] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
[0.869s][info   ][gc,task      ] GC(0) Using 13 workers of 13 for evacuation
[0.872s][info   ][gc,phases    ] GC(0)   Pre Evacuate Collection Set: 0.0ms
[0.873s][info   ][gc,phases    ] GC(0)   Evacuate Collection Set: 2.8ms
[0.873s][info   ][gc,phases    ] GC(0)   Post Evacuate Collection Set: 0.4ms
[0.873s][info   ][gc,phases    ] GC(0)   Other: 1.0ms
[0.873s][info   ][gc,heap      ] GC(0) Eden regions: 51->0(45)
[0.873s][info   ][gc,heap      ] GC(0) Survivor regions: 0->7(7)
[0.873s][info   ][gc,heap      ] GC(0) Old regions: 0->2
[0.873s][info   ][gc,heap      ] GC(0) Humongous regions: 4->2
[0.873s][info   ][gc,metaspace ] GC(0) Metaspace: 15608K->15608K(1062912K)
[0.874s][info   ][gc           ] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 55M->10M(1024M) 5.582ms
[0.874s][info   ][gc,cpu       ] GC(0) User=0.00s Sys=0.00s Real=0.01s
[...]

This makes me wonder why a GC runs here at all when only 55 MB of the 1024 MB heap is in use.
In total I usually see 10-15 GC runs, which add up to about 1 second of user CPU time that I would like to avoid.

JVM: openjdk version "11.0.16" 2022-07-19
JVM ARGS: -Xms1g -Xmx2g -XX:+PrintGCDetails -Xlog:gc+cpu=info -Xlog:gc+heap+exit 

Question:
How can I tune G1 (jdk 11) to kick in as late as possible (e.g. when heap/eden is 90% full) to ideally avoid any GC pauses/runs in most of my cases?
Increasing -XX:InitiatingHeapOccupancyPercent (e.g. to 90%) did not help in my case.


EDIT:

Try it yourself by running this Java class on your JVM:

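public class GCTest {
    public static void main(String[] args) {
        // Holds every entry until exit, so the stored keys and arrays stay live.
        java.util.Map<String, byte[]> map = new java.util.HashMap<>();

        // One million entries with payloads of 0..255 bytes each.
        for (int i = 0; i < 1_000_000; i++)
            map.put(i + "", new byte[i % 256]);

        System.out.println(map.size());
    }
}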

This application consumes about 260 MB of heap and runs for less than 500 ms.
When started with the following JVM arguments:
-Xms1g -Xmx2g -XX:+PrintGCDetails -Xlog:gc+cpu=info -Xlog:gc+heap+exit
you will get about 5-6 GC runs (tested with HotSpot JVM 11 and 16).
Tests with Epsilon GC clearly show that it can run without any GC.

Challenge:
Can you find JVM arguments that make G1 perform no GC at all here?
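
One way to check a candidate set of JVM arguments without reading the GC log is to ask the JVM how many collections it has run. The following is a minimal sketch, not part of the original post: the class name GcCountCheck and its helper methods are made up for illustration, and it relies only on the standard java.lang.management API. The workload is the same loop as in the test class above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class (name and methods are illustrative only).
public class GcCountCheck {

    // Sums the collection counts of all collectors the JVM exposes.
    // A bean may report -1 when the count is undefined; those are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Sums the approximate accumulated collection time in milliseconds.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Same allocation pattern as the test class in the question.
    static int runWorkload(int iterations) {
        Map<String, byte[]> map = new HashMap<>();
        for (int i = 0; i < iterations; i++) {
            map.put(i + "", new byte[i % 256]);
        }
        return map.size();
    }

    public static void main(String[] args) {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        long before = totalGcCount();
        int size = runWorkload(iterations);
        long after = totalGcCount();
        System.out.println("entries=" + size
                + ", collections during workload=" + (after - before)
                + ", accumulated GC time (ms)=" + totalGcTimeMillis());
    }
}
```

Run it with the arguments under test, for example java -Xms1g -Xmx2g GcCountCheck, and compare the reported collection count across settings.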

Creolacreole answered 12/8, 2022 at 17:12 Comment(16)
JDK 11 comes with Epsilon GC, which is a collector that does nothing. If you're sure your app won't go over, try that, maybe. - Paulettapaulette
@M.Prokhorov I did that (see first sentence). However, as mentioned there, it will not always fit, so I need a GC, but one which runs very late (when the heap is nearly full). - Creolacreole
Yes, sorry, I missed that part. Never mind, then. - Paulettapaulette
If your application is short-lived, you might want to try GraalVM's native-image/AOT compilation. Aside from that, you might want to use -client. - Highclass
Did you experience a partial garbage collection (where only the young generation was cleaned up) or a full garbage collection? If it's the former, you might want to increase the size of the young generation using -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=45 or similar. - Highclass
@Highclass GraalVM is currently not an option (reflection is used). I played with -XX:G1NewSizePercent as well but this did not help either. - Creolacreole
Have you tried running OpenJ9 as an alternative and is this feasible for you? It has some other GCs that may be better for short-lived applications. - Highclass
Why do you focus on G1? Its purpose is kind of opposite to what you're trying to achieve: G1 targets short frequent collections. It sounds like Parallel GC will fit your requirements better. - Verde
@Verde You're right, I also thought that ParallelGC would fit my needs even better. Unfortunately, I also get GC runs with ParallelGC and, even more interesting, the -Xmn argument which Turac proposed for G1 and which solves the challenge doesn't stop ParallelGC from doing GC runs. Question to you: do you have VM arguments for ParallelGC that prevent GC runs when executing the above test application? - Creolacreole
The idea is the same: the New Generation should be large enough to accommodate all allocated objects. With Parallel GC, the size of New Generation includes two survivor spaces. So either increase -Xmn or decrease the size of survivor spaces. I checked that the following arguments work: -XX:+UseParallelGC -Xmx500m -Xmn300m -XX:SurvivorRatio=16 - Verde
@Verde With your settings I still get 2 GC runs (HotSpot VM 11 and VM 16), but when setting -Xmx2g -Xmn1g, it works with ParallelGC. However, when increasing the iteration count from 1_000_000 to 3_000_000 (which can still run with Epsilon GC) I was unable to find settings for ParallelGC that avoid any GC runs, while G1 surprisingly still does not run. Do you have an idea? - Creolacreole
What are GC logs? I see no GC after 3M iterations with the following arguments: java -Xlog:gc -XX:+UseParallelGC -Xmx2g -Xms2g -Xmn1g GCTest - Verde
@Verde I get [0.800s][info][gc] GC(0) Pause Young (Allocation Failure) 768M->525M(1920M) 149.919ms (3M iterations) on Windows. Just tried it on Linux and there is indeed no GC when using the same/your settings - interesting! - Creolacreole
I guess either -XX:+UseCompressedOops or -XX:+CompactStrings is off. Otherwise the total allocated memory should not have reached 768M. - Verde
Nope, both are on. - Creolacreole
apangin is right, and his arguments work without a GC pause. Of course the Parallel collector would be a better choice, but the question was specifically about G1 (I assumed that was a customer restriction). - Ascendant

You can't avoid GC entirely, but you can postpone it. The key question is: when does a GC trigger?

When Eden is full, a minor (young) GC is triggered. So if the young generation is large enough, no GC is triggered, because Eden never fills up. That raises the next question: how do we set the young generation size?

The JVM has the -Xmn argument, which sets the initial and maximum size of the heap for the young generation. "Initial" is the important keyword here. From the Oracle documentation:

-Xmn size Sets the initial and maximum size (in bytes) of the heap for the young generation (nursery) in the generational collectors. Append the letter k or K to indicate kilobytes, m or M to indicate megabytes, or g or G to indicate gigabytes. The young generation region of the heap is used for new objects. GC is performed in this region more often than in other regions. Instead of the -Xmn option to set both the initial and maximum size of the heap for the young generation, you can use -XX:NewSize to set the initial size and -XX:MaxNewSize to set the maximum size.

So when we apply the -Xmn argument, for example -Xmn300m -Xmx512m -XX:+PrintGCDetails -Xlog:gc+cpu=info -Xlog:gc+heap+exit, we no longer see a GC pause. I tried this with your example. Result:

[0.002s][warning][gc] -XX:+PrintGCDetails is deprecated. Will use -Xlog:gc* instead.
[0.007s][info   ][gc,heap] Heap region size: 1M
[0.015s][info   ][gc     ] Using G1
[0.015s][info   ][gc,heap,coops] Heap address: 0x00000000e0000000, size: 512 MB, Compressed Oops mode: 32-bit
1000000
[0.321s][info   ][gc,heap,exit ] Heap
[0.321s][info   ][gc,heap,exit ]  garbage-first heap   total 522240K, used 245760K [0x00000000e0000000, 0x0000000100000000)
[0.321s][info   ][gc,heap,exit ]   region size 1024K, 221 young (226304K), 0 survivors (0K)
[0.321s][info   ][gc,heap,exit ]  Metaspace       used 6722K, capacity 6815K, committed 7040K, reserved 1056768K
[0.321s][info   ][gc,heap,exit ]   class space    used 596K, capacity 613K, committed 640K, reserved 1048576K

Note: if the application does trigger a GC with these arguments (by allocating more than the young generation can hold), you may see a latency problem, because that GC can take a long time to complete.
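
To judge whether a workload fits into the young generation, it can help to estimate how much it allocates. Below is a rough sketch for the byte arrays of the test program only. The 16-byte array header and 8-byte object alignment are assumptions (typical for 64-bit HotSpot with compressed class pointers, but not stated anywhere in this thread), and the class name ByteArrayFootprint is hypothetical. The estimate ignores the String keys, the HashMap nodes and table, and temporary garbage, so treat it as a lower bound.

```java
// Hypothetical estimator (illustrative only); layout constants are assumptions.
public class ByteArrayFootprint {

    // Assumed array header size on 64-bit HotSpot with compressed class pointers.
    static final int ASSUMED_ARRAY_HEADER_BYTES = 16;
    // Assumed object alignment (the HotSpot default).
    static final int ASSUMED_ALIGNMENT_BYTES = 8;

    // Raw payload: sum of the array lengths (i % 256) over all iterations.
    static long payloadBytes(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i % 256;
        }
        return sum;
    }

    // Payload plus assumed header, each array rounded up to the assumed alignment.
    static long alignedArrayBytes(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            long raw = ASSUMED_ARRAY_HEADER_BYTES + (i % 256);
            long aligned = (raw + ASSUMED_ALIGNMENT_BYTES - 1)
                    / ASSUMED_ALIGNMENT_BYTES * ASSUMED_ALIGNMENT_BYTES;
            sum += aligned;
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        System.out.println("payload bytes:       " + payloadBytes(iterations));
        System.out.println("aligned array bytes: " + alignedArrayBytes(iterations));
    }
}
```

If the estimate already comes close to the chosen -Xmn value, the real footprint (with keys, map nodes and temporaries on top) will most likely exceed it and a young collection should be expected.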

Ascendant answered 13/8, 2022 at 20:37 Comment(1)
This is the correct answer. The reason GC is invoked multiple times is that the JVM starts with a small initial memory size; each time it fills up, the JVM invokes GC and then grows the memory as well. If you declare the needed initial memory from the start, it will not fill up during execution, so no GC and no heap expansion are needed. - Compaction
