Is String Deduplication feature of the G1 garbage collector enabled by default?
Asked Answered
S

2

16

JEP 192: String Deduplication in G1 implemented in Java 8 Update 20 added the new String deduplication feature:

Reduce the Java heap live-data set by enhancing the G1 garbage collector so that duplicate instances of String are automatically and continuously deduplicated.

The JEP page mentions that a command-line option UseStringDeduplication (bool) allows the dedup feature to be enabled or disabled. But the JEP page does not go so far as to indicate the default.

➠ Is the dedup feature ON or OFF by default in the G1 garbage collector bundled with Java 8 and with Java 9?

➠ Is there a “getter” method to verify the current setting at runtime?

I do not know where to look for documentation beyond the JEP page.

In at least the HotSpot-equipped implementations of Java 9, the G1 garbage collector is enabled by default. That fact prompted this Question now. For more info on String interning and deduplication, see this 2014-10 presentation by Aleksey Shipilev at 29:00.

Slight answered 3/10, 2017 at 4:51 Comment(1)
It is not enabled by default in Java 8. You have to use -XX:+UseStringDeduplication flag. Note that this won't work if you are not using G1 GC.Elgin
P
22

String deduplication off by default

For the versions of Java 8 and Java 9 seen below, UseStringDeduplication is false (disabled) by default.

One way to verify the feature setting: list out all the final flags for JVM and then look for it.

build 1.8.0_131-b11

    $ java -XX:+UseG1GC  -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i 'duplicat'
     bool PrintStringDeduplicationStatistics        = false                               {product}
    uintx StringDeduplicationAgeThreshold           = 3                                   {product}
     bool StringDeduplicationRehashALot             = false                               {diagnostic}
     bool StringDeduplicationResizeALot             = false                               {diagnostic}
     bool UseStringDeduplication                    = false                               {product}
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

build 9+18

    $ java -XX:+UseG1GC  -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i 'duplicat'
    uintx StringDeduplicationAgeThreshold          = 3                                        {product} {default}
     bool StringDeduplicationRehashALot            = false                                 {diagnostic} {default}
     bool StringDeduplicationResizeALot            = false                                 {diagnostic} {default}
     bool UseStringDeduplication                   = false                                    {product} {default}
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

Another way to test it is with

package jvm;

import java.util.ArrayList;
import java.util.List;

public class StringDeDuplicationTester {

    public static void main(String[] args) throws Exception {
        List<String> strings = new ArrayList<>();
        while (true) {
            for (int i = 0; i < 100_00; i++) {
                strings.add(new String("String " + i));
            }
            Thread.sleep(100);
        }
    }
}

run without explicitly specifying it.

$ java  -Xmx256m -XX:+UseG1GC -XX:+PrintStringDeduplicationStatistics jvm.StringDeDuplicationTester
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at jvm.StringDeDuplicationTester.main(StringDeDuplicationTester.java:12)

Run with explicitly turning it ON.

$ java  -Xmx256m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics jvm.StringDeDuplicationTester
[GC concurrent-string-deduplication, 5116.7K->408.7K(4708.0K), avg 92.0%, 0.0246084 secs]
   [Last Exec: 0.0246084 secs, Idle: 1.7075173 secs, Blocked: 0/0.0000000 secs]
      [Inspected:          130568]
         [Skipped:              0(  0.0%)]
         [Hashed:          130450( 99.9%)]
         [Known:                0(  0.0%)]
         [New:             130568(100.0%)   5116.7K]
      [Deduplicated:       120388( 92.2%)   4708.0K( 92.0%)]
         [Young:                0(  0.0%)      0.0B(  0.0%)]
         [Old:             120388(100.0%)   4708.0K(100.0%)]
   [Total Exec: 1/0.0246084 secs, Idle: 1/1.7075173 secs, Blocked: 0/0.0000000 secs]
      [Inspected:          130568]
         [Skipped:              0(  0.0%)]
         [Hashed:          130450( 99.9%)]
         [Known:                0(  0.0%)]
         [New:             130568(100.0%)   5116.7K]
      [Deduplicated:       120388( 92.2%)   4708.0K( 92.0%)]
         [Young:                0(  0.0%)      0.0B(  0.0%)]
         [Old:             120388(100.0%)   4708.0K(100.0%)]
   [Table]
      [Memory Usage: 264.9K]
      [Size: 1024, Min: 1024, Max: 16777216]
      [Entries: 10962, Load: 1070.5%, Cached: 0, Added: 10962, Removed: 0]
      [Resize Count: 0, Shrink Threshold: 682(66.7%), Grow Threshold: 2048(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
[GC concurrent-string-deduplication, deleted 0 entries, 0.0000008 secs]
...
output truncated

Note: this output is from build 1.8.0_131-b11. Looks like Java 9 has no option to print String de-duplication statistics. Potential bug ? No. Unified logging killed this specific option.

$ java  -Xmx256m -XX:+UseG1GC -XX:+PrintStringDeduplicationStatistics -version
Unrecognized VM option 'PrintStringDeduplicationStatistics'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Petrifaction answered 3/10, 2017 at 5:6 Comment(4)
For the bug statement, probably the flag is legacy now and needs to be Xlogged instead. Have addressed that in my answer.Evoy
Thanks for the notice that statistics flag has not working in java9Lynseylynus
But why is it off by default?Lenni
OFF by default in jdk 11 as well.Dipnoan
E
15

Though Jigar has precisely provided the way to get to know the JVM flags and stats, yet to link to some useful documents addressing this part of the question:

I do not know where to look for documentation beyond the JEP page.

In JDK 9, the default garbage collector is G1 when a garbage collector is not explicitly specified.

  • The java tool which details the usage of the flag

    -XX:+UseStringDeduplication
    

Enables string deduplication. By default, this option is disabled. To use this option, you must enable the garbage-first (G1) garbage collector.

String deduplication reduces the memory footprint of String objects on the Java heap by taking advantage of the fact that many String objects are identical. Instead of each String object pointing to its own character array, identical String objects can point to and share the same character array.


Also addressing the open question there if

Java 9 has no option to print String de-duplication statistics.

With JEP 158:Unified JVM Logging implementation in Java9, the garbage collector flags are marked as legacy and alternate way of tracing them is using -Xlog feature. A detailed list of the replacement for converting GC Logging Flags to Xlog is listed here. One of which suggests replacing

PrintStringDeduplicationStatistics  =>   -Xlog:stringdedup*=debug
Evoy answered 3/10, 2017 at 5:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.