If I use String.intern() to improve performance, since I can then use "==" to compare interned strings, will I run into garbage collection issues? How does the garbage collection mechanism for interned strings differ from that for normal strings?
In fact, this is not a garbage collection optimization, but rather a string pool optimization.
When you call String.intern(), you replace the reference to your initial String with its base reference (the reference to the first instance of this string that was encountered, or this same reference if the string is not yet known).
Prior to Java 7, interned strings were allocated in PermGen space. This would become a garbage collection issue once your string was no longer used by the application, since the interned string pool is a static member of the String class and will never be garbage collected. From Java 7 onward, interned strings are allocated on the heap and are subject to garbage collection.
As a rule of thumb, I consider it preferable never to call intern() yourself and to let the compiler apply it only to constant Strings, those declared like this:
String myString = "a constant that will be interned";
This is better in the sense that it won't lead you to falsely assume that == will work when it won't.
Besides, String.equals itself starts with an == check as an optimization, so you benefit from interned strings under the hood anyway. This is one more reason why == should never be used on Strings.
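To make the pitfall concrete, here is a minimal sketch of how == and equals() behave on a literal, on a string built at run time, and on the result of intern(). The class and helper names (InternComparisonDemo, buildAtRuntime) are made up for illustration.

```java
// Sketch: reference equality (==) versus value equality (equals), with and without interning.
final class InternComparisonDemo {

    /** Builds a fresh String at run time so the compiler cannot fold it into a constant. */
    static String buildAtRuntime(String prefix, int n) {
        return new StringBuilder(prefix).append(n).toString();
    }

    public static void main(String[] args) {
        String literal = "value42";                    // compile-time constant, interned automatically
        String runtime = buildAtRuntime("value", 42);  // same characters, separate object

        System.out.println(literal == runtime);          // false: two distinct objects
        System.out.println(literal.equals(runtime));     // true: same characters
        System.out.println(literal == runtime.intern()); // true: intern() returns the pooled instance
    }
}
```

The first comparison is the one that bites: it compiles, looks reasonable, and silently returns false for equal contents.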
"== should never be used on Strings"
Nah, it can be used, and it is even used in some places in JDK code, as there are many strings that are guaranteed to be interned, like field names. – Bahamas
String.intern() manages an internal, native-implemented pool, which has some special GC-related features. This is old code, but if it were implemented anew, it would use a java.util.WeakHashMap. Weak references are a way to keep a pointer to an object without preventing it from being collected. Just the right thing for a unifying pool such as interned strings.
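For illustration only, here is a rough sketch of what a WeakHashMap-based unifying pool could look like. This is not how the JVM implements String.intern(); the class name WeakInterner and its method are hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch of a weak interning pool; NOT the JVM's actual String.intern().
final class WeakInterner {
    // WeakHashMap holds its keys weakly. The value must be weak too: a strong
    // reference from the value back to the key would keep the entry alive forever.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    /** Returns the canonical instance equal to s, registering s if none exists yet. */
    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            pool.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        WeakInterner interner = new WeakInterner();
        String a = interner.intern(new String(new char[] {'x', 'y'}));
        String b = interner.intern(new String(new char[] {'x', 'y'}));
        System.out.println(a == b); // true: the second call returns the first instance
    }
}
```

Once nothing else references a pooled string, the collector is free to reclaim it and the map drops the entry, which is the behaviour you want from a unifying pool.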
That interned strings are garbage collected can be demonstrated with the following Java code:
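public class InternedStringsAreCollected {

    public static void main(String[] args)
    {
        for (int i = 0; i < 30; i ++) {
            foo();
            // Request a collection so the previous interned instance can be reclaimed.
            System.gc();
        }
    }

    private static void foo()
    {
        // Build the string from a char[] so that it is not a literal,
        // which would be kept alive by the class that defines it.
        char[] tc = new char[10];
        for (int i = 0; i < tc.length; i ++)
            tc[i] = (char)(i * 136757);
        String s = new String(tc).intern();
        System.out.println(System.identityHashCode(s));
    }
}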
This code creates the same string 30 times, interning it each time. It also uses System.identityHashCode() to show what hash code Object.hashCode() would have returned on that interned string. When run, this code prints out distinct integer values, which means that you do not get the same instance each time.
Anyway, usage of String.intern() is somewhat discouraged. It is a shared static pool, which means that it easily turns into a bottleneck on multi-core systems. Use String.equals() to compare strings, and you will live longer and happier.
If two threads, each running on its own core, call String.intern() on two strings which happen to have the same contents, then they must both obtain the same reference. This necessarily implies some sort of communication between the two cores. In practice, String.intern() is implemented with a sort-of hashtable protected by a mutex, and each access (read or write) locks the mutex. There can be contention on that mutex, but most of the slowdown will be due to the necessity for the cores to synchronize their L1 caches (such synchronization is implied by the mutex locking, and is the expensive part). – Armandinaarmando
1. new String("a") creates a new instance each time. 2. .intern() searches the string pool, finds an instance with an identical value (which was put into the string pool the first time you called .intern()), and returns the reference to that old instance. – Aeneus
The "a" string literal is interned, and is kept alive by its use as a string literal, so it never gets GC'ed. All the intern calls return the same string object as the original "a". That's why this answer goes out of its way to construct a string from a char[] instead of a string literal. – Exmoor
This article provides the full answer.
In Java 6 the string pool resides in PermGen; since Java 7 it resides in heap memory.
Manually interned strings will be garbage collected once they are no longer referenced.
String literals will only be garbage collected if the class that defines them is unloaded.
The string pool is a fixed-size hash table, which was small in Java 6 and early versions of Java 7, but its default size increased to 60013 in Java 7u40.
It can be changed with the -XX:StringTableSize=<new size> option and viewed with the -XX:+PrintFlagsFinal option.
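If you would rather check the effective table size from inside a running program, something along these lines should work on HotSpot-based JVMs. It is a sketch under that assumption: it relies on the com.sun.management.HotSpotDiagnosticMXBean interface, which other JVMs may not provide, and the class name StringTableSizeProbe is made up.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read the effective StringTableSize flag from inside a running HotSpot JVM.
final class StringTableSizeProbe {

    /** Returns the flag's value as text, or null if this JVM does not expose it. */
    static String stringTableSize() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return null; // no HotSpot diagnostic bean available
        }
        try {
            return bean.getVMOption("StringTableSize").getValue();
        } catch (IllegalArgumentException e) {
            return null; // this JVM does not know the flag
        }
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Running it once with default settings and once with -XX:StringTableSize=<new size> on the command line shows whether your setting was picked up.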
Please read: http://satukubik.com/2009/01/06/java-tips-memory-optimization-for-string/
The conclusion I draw from your information is that you have interned too many Strings. If you really need to intern that many Strings as a performance optimization, increase the PermGen memory, but if I were you, I would first check whether I really need so many interned Strings.