The return of String.intern() explained

Consider:

String s1 = new StringBuilder("Cattie").append(" & Doggie").toString();
System.out.println(s1.intern() == s1); // true why?
System.out.println(s1 == "Cattie & Doggie"); // true another why?

String s2 = new StringBuilder("ja").append("va").toString();
System.out.println(s2.intern() == s2); // false

String s3 = new String("Cattie & Doggie");
System.out.println(s3.intern() == s3); // false
System.out.println(s3 == "Cattie & Doggie"); // false

I'm confused about why these give different results, given what the documentation of String.intern() says about its return value:

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

Especially after these two tests:

assertFalse("new String() should create a new instance", new String("jav") == "jav");
assertFalse("new StringBuilder() should create a new instance",
    new StringBuilder("jav").toString() == "jav");

I once read a post talking about some special strings interned before everything else, but it's a real blur now.

If some strings are pre-interned, is there a way to get a list of them? I am just curious about what they might be.


Updated

Thanks to the help of @Eran and @Slaw, I can finally explain what produced this output:

true
true
false
false
false
  1. Since "Cattie & Doggie" is not yet in the pool, s1.intern() adds the object referenced by s1 to the pool and returns it, so s1.intern() == s1;
  2. "Cattie & Doggie" is now in the pool, so the string literal "Cattie & Doggie" resolves to the pooled reference, which is s1 itself, and we get true again (this relies on the literal being resolved only when that line first runs, not when the class is loaded);
  3. new StringBuilder().toString() creates a new instance, but "java" is already in the pool, so s2.intern() returns the pooled reference rather than s2; hence s2.intern() != s2 and we get false;
  4. new String() also creates a new instance, but s3.intern() returns the reference previously stored in the pool, which is actually s1, so s3.intern() != s3 and we get false;
  5. As discussed in #2, the string literal "Cattie & Doggie" resolves to the reference already stored in the pool (which is actually s1), while s3 is a different instance, so s3 != "Cattie & Doggie" and we get false again.
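
To make the ordering effect in points 1, 2 and 5 easier to see, here is a minimal sketch. The class name, method names and the animal strings are made up for illustration. The first method touches the literal before calling intern(), the second calls intern() first. The first method's result follows from the intern() contract; the second depends on what is already in the pool, so I only note what I would typically expect.

```java
class InternOrderDemo {

    // The literal is evaluated first, so the pool already holds an equal
    // string when intern() runs. intern() returns that pooled instance,
    // which cannot be the freshly built string.
    static boolean literalResolvedFirst() {
        String literal = "Fishie & Birdie";
        String built = new StringBuilder("Fishie").append(" & Birdie").toString();
        String pooled = built.intern();
        return pooled == built && pooled == literal; // false: pooled is `literal`, not `built`
    }

    // intern() is called before the literal is evaluated.
    // Element 0: did intern() return the built string itself?
    // Element 1: does the literal evaluate to the built string?
    // Both answers always agree, because the literal and intern() both
    // yield the one canonical pooled instance.
    static boolean[] internCalledFirst() {
        String built = new StringBuilder("Horsie").append(" & Cowie").toString();
        boolean internReturnedSelf = built.intern() == built;
        boolean literalIsBuilt = built == "Horsie & Cowie";
        return new boolean[] { internReturnedSelf, literalIsBuilt };
    }

    public static void main(String[] args) {
        System.out.println("literal first: " + literalResolvedFirst());
        boolean[] r = internCalledFirst();
        // On the first call I would typically expect both to be true on
        // HotSpot, but that depends on the JVM resolving literals lazily.
        System.out.println("intern first:  " + r[0] + ", " + r[1]);
    }
}
```

Calling internCalledFirst() a second time in the same JVM should give false for both elements, because by then an equal string is already in the pool.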

Thanks to @Sunny for providing a trick to get all the interned strings.

Bartram answered 13/3, 2019 at 6:50 Comment(0)

s2.intern() would return the instance referenced by s2 only if the String pool didn't contain a String whose value is "java" prior to that call. The JDK classes intern some Strings before your code is executed. "java" must be one of them. Therefore, s2.intern() returns the previously interned instance instead of s2.

On the other hand, the JDK classes did not intern any String whose value is equal to "Cattie & Doggie", so s1.intern() returns s1.

I am not aware of any list of pre-interned Strings. Such a list will most likely be considered an implementation detail, which may vary on different JDK implementations and JDK versions, and should not be relied on.
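
Although there is no list, you can probe individual values. The following is only a sketch under my own assumptions: the class PreInternedProbe and the method wasAlreadyInterned are hypothetical names, not part of any JDK API. It builds a fresh copy of the value and checks whether intern() hands back a different instance. Note that probing interns the value as a side effect, and that the candidates have to be built at runtime, since writing them as literals would intern them.

```java
import java.util.UUID;

class PreInternedProbe {

    // Returns true if an equal string was in the pool before this call.
    // Side effect: after the call, an equal string is always in the pool.
    static boolean wasAlreadyInterned(String value) {
        String fresh = new String(value);   // guaranteed to be a new instance
        return fresh.intern() != fresh;     // a different instance came back => already pooled
    }

    public static void main(String[] args) {
        // Pieces are joined at runtime so the full values never appear as literals.
        String[][] parts = { { "ja", "va" }, { "ma", "in" }, { "Cattie", " & Doggie" } };
        for (String[] p : parts) {
            String candidate = p[0] + p[1];
            // Results depend on the JDK, the launcher and the classes loaded so far.
            System.out.println(candidate + " -> " + wasAlreadyInterned(candidate));
        }

        // A random value should not be in the pool the first time,
        // and must be in the pool the second time.
        String unique = UUID.randomUUID().toString();
        System.out.println("first probe:  " + wasAlreadyInterned(unique));
        System.out.println("second probe: " + wasAlreadyInterned(unique));
    }
}
```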

Declinate answered 13/3, 2019 at 6:54 Comment(13)
Thank you for the detailed explanation. So is it correct to say that s.intern() returns the original reference if the string is not yet interned, but returns the reference in the pool if an equal string is already interned? - Bartram
@Bartram that's true, as the javadoc says: "When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned." - Declinate
Sorry to bother you again. If that's so, then why does the second "why" still return true? So confusing now... And when I replace new StringBuilder().toString() with new String(), both of them become false. So weird... - Bartram
I updated my question, there is another "why". Sorry for the inconvenience, please check the update. - Bartram
@Bartram Oh, I see now. System.out.println(s1 == "Cattie & Doggie"); returns true because s1 references the interned instance equal to "Cattie & Doggie" (due to the previous s1.intern() call). The String literal "Cattie & Doggie" does not cause a new String to be created if an equal String is already present in the pool. - Declinate
But there is no explicit assignment such as s1 = s1.intern(), so shouldn't s1 still be the reference pointing to the instance created by new StringBuilder()? - Bartram
@Bartram s1.intern() == s1 is already true (since s1.intern() added the instance referenced by s1 to the pool), so there's no need to assign s1.intern() to s1. - Declinate
Let us continue this discussion in chat. - Bartram
@Bartram "Otherwise, this String object is added to the pool and a reference to this String object is returned". When you called s1.intern(), the s1 instance was put into the pool. You then later use a string literal with the same value as s1, which means it uses that instance from the pool. - Postaxial
@Postaxial Hah! I get your point now, I will update the question with a summary. Thank you Slaw! Thank you Eran. - Bartram
"any list of pre-interned Strings" would depend on which JDK classes your program happened to load as well. - Grouper
@AlexeyRomanov and the launcher, e.g. the commonly used standard launcher loads the specified main class and does a getMethod("main", String[].class) on it, thus “pre-interning” the string "main". A different launcher, e.g. a native launcher invoking the main method via JNI, would behave differently. Likewise, the way command line options are processed may differ and hence have a different effect on the list of “pre-interned” strings. - Lodi
I'm kind of surprised that the literal "Cattie & Doggie" isn't interned as soon as the class is loaded. I guess I need to brush up on my JVM internals again. - Saffian

When intern() is invoked on a String object, it looks up that string's value in the pool. If an equal string is found there, the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

So the string "java" must already be in the pool, which is why the result is false.

You can print all the strings in the pool:

How to print the whole String pool?

Here is an example that gets all the strings if you are using OpenJDK.

Dinnage answered 13/3, 2019 at 7:1 Comment(1)
I just tried the GitHub example you linked, but it doesn't seem to work, even though I added the dependency it requires, $JAVA_HOME/lib/sa-jdi.jar. As for the SO link, How to print the whole String pool?, I haven't tested it yet, but it looks tricky. Thanks for the help :) - Bartram

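String literals (those that are hardcoded, like "a string") are interned for you automatically: the compiler stores them in the class file's constant pool, and the JVM interns them at runtime when the constant is resolved. Strings that are created programmatically are not, and will be interned only if you call the .intern() method.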

Usually you don't intern strings manually unless you know you will keep a large number of repeated strings in memory, in which case interning can save a lot of memory.
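
As a rough illustration of that memory argument, the sketch below counts how many distinct String instances end up being retained with and without interning. The class InternDedupDemo and its method are names I made up, and new String(v) merely stands in for strings arriving from I/O or parsing. It counts instances only; it does not measure actual heap usage.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class InternDedupDemo {

    // Counts distinct String instances (by identity, not equals) that
    // would be retained for the given values.
    static int distinctInstances(String[] values, boolean intern) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (String v : values) {
            String copy = new String(v); // a new instance for every occurrence
            identities.add(intern ? copy.intern() : copy);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        String[] colors = { "red", "green", "blue" };
        String[] values = new String[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = colors[i % colors.length];
        }
        System.out.println("without intern: " + distinctInstances(values, false)); // 1000
        System.out.println("with intern:    " + distinctInstances(values, true));  // 3
    }
}
```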

That is explained here: What is Java String interning?

Resilient answered 13/3, 2019 at 7:5 Comment(0)
