Why does intern() not work with the literal "java"?
C

4

12

I have tried the code below:

public class TestIntern {
  public static void main(String[] args) {
   char[] c1={'a','b','h','i'};
   String s1 = new String(c1);
   s1.intern();
   String s2="abhi";
   System.out.println(s1==s2);//true

   char[] c2={'j','a','v','a'};
   String sj1 = new String(c2);
   sj1.intern();
   String sj2="java";
   System.out.println(sj1==sj2);//false

   char[] c3={'J','A','V','A'};
   String tj1 = new String(c3);
   tj1.intern();
   String tj2="JAVA";
   System.out.println(tj1==tj2);//true
  }
}

I have tried many different literals.

Could anyone please explain why intern() doesn't work as expected with literal "java"? Why do the above reference comparisons evaluate to true, except when the literal is "java"?

Czar answered 27/3, 2018 at 21:7 Comment(13)
Tried sj.intern() == sc.intern(); it is supposed to return true. - Udell
You don't compare strings with ==. - Quoits
You are ignoring the return value of intern(). Re-read the docs, assign the return value to your original reference, and you'll see it works as you expect. Also, please do not ever depend on intern(). - Kimura
String.intern is not a void method, it returns a String. You are ignoring the return value. Read the documentation. Do not ignore return values. - Songwriter
Looks like a typo to me; you just didn't reassign s1 or sj or sj1 anywhere. - Conscious
@Quoits The OP wants to compare references in this case. - Endodontics
@MarcosVasconcelos yes, it is supposed to. But it is not returning true. I am also surprised. - Czar
The answers and comments until now are, as far as I can tell, missing the point of the question. At the very least, they don't nearly explain the observed behavior, which is as described, and very surprising (for me, at least). - Schnook
@MDSayemAhmed That's because "ABCD" is always the reference from the first invocation, while subsequent new String() calls always return a new object. - Kimura
Well, people are voting to close this question, as being "...caused by a problem that can no longer be reproduced or a simple typographical error". These people may simply be not nerdy enough for this sort of question :-) - Schnook
@AbhishekKumar Petr Janeček essentially gave the answer, and I just "confirmed" it, as far as reasonably possible. (I'm just mentioning this for the case that you accidentally accepted my answer, but it's up to you, of course...) - Schnook
I think the people voting to close just missed the fact that the ignored return value is actually part of the question and not just a mistake. - Vaules
I know this is definitely a dupe, but it's pretty hard to search for it... - Cigarillo
P
17

When you call intern() on new String(new char[] {'a', 'b', 'h', 'i'}) and no equal string is in the pool yet, the reference you just created becomes the canonical one and is stored in the string constant pool. The literal "abhi" is then pulled out of the constant pool, so your canonical instance is reused.

Your problem is that the literal "java" already exists in the string constant pool before your program starts - the JVM simply has it there for some use of its own. Therefore, calling intern() on new String(new char[] {'j', 'a', 'v', 'a'}) does not intern your reference. Instead, it returns the pre-existing canonical value from the constant pool, and you happily ignore the return value.

You should not ignore the return value, but use it. You can never be sure that your "definitely original" string has not been living in the constant pool since the start of the JVM. In any case, all of this is implementation dependent: either always use the references returned by the intern() method, or never use them. Do not mix the two.
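
To make that advice concrete, here is a minimal sketch that keeps the reference returned by intern() instead of discarding it. The class name InternReturnValueDemo and the helper canonical() are made up for illustration and do not appear in the question. Only the comparison against the returned reference is guaranteed; whether the freshly created object itself ends up in the pool is left to the JVM, so that result is printed without an expected value.

```java
class InternReturnValueDemo {

    // Hypothetical helper: builds a new String and returns its canonical (pooled) instance.
    static String canonical(char[] chars) {
        return new String(chars).intern();
    }

    public static void main(String[] args) {
        char[] c = {'j', 'a', 'v', 'a'};
        String fresh = new String(c);      // always a new object
        String interned = fresh.intern();  // keep the return value
        String literal = "java";           // literals always refer to the pooled instance

        // The returned reference and the literal are the same object.
        System.out.println(interned == literal);   // true

        // Content comparison never depends on the pool.
        System.out.println(fresh.equals(literal)); // true

        // Implementation dependent: is the fresh object the pooled one?
        System.out.println(fresh == literal);
    }
}
```

If the code only ever compares references obtained from intern() (or from literals), the result no longer depends on what the JVM happened to intern earlier.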

Phares answered 27/3, 2018 at 21:20 Comment(4)
Then how come this program prints true on the first line, but false for the rest: ideone.com/SQiiRf? Not only that, for all subsequent calls to callMethod it will print false. - Groyne
@MDSayemAhmed because new String() always creates a new reference, it never reaches into the constant pool. The first invocation is the one that gets stored in the constant pool; the other invocations create different objects that are being ignored by the intern() method. On the other hand, a "java" literal will always return the same object, as far as I know. Uh, is that in the spec, though? I'm honestly not sure. - Kimura
Yeah, "java" will always return the same interned object, that I can confirm. Thank you for explaining with patience. - Groyne
@PetrJaneček Your line: "the literal "java" exists in the constant string pool before the start of your program - the JVM simply has it there for some use" really makes sense. I tried re-assigning with "sj=sj.intern()" for 'java' and it worked perfectly fine. Thank you. - Czar
S
3

The answer by Petr Janeček is almost certainly correct (+1 there).

Really proving it is hard, because much of the string pool resides in the JVM itself, and one could hardly access it without a tweaked VM.

But here is some more evidence:

public class TestInternEx
{
    public static void main(String[] args)
    {
        char[] c1 = { 'a', 'b', 'h', 'i' };
        String s1 = new String(c1);
        String s1i = s1.intern();
        String s1s = "abhi";
        System.out.println(System.identityHashCode(s1));
        System.out.println(System.identityHashCode(s1i));
        System.out.println(System.identityHashCode(s1s));
        System.out.println(s1 == s1s);// true

        char[] cj =
        { 'j', 'a', 'v', 'a' };
        String sj = new String(cj);
        String sji = sj.intern();
        String sjs = "java";
        System.out.println(System.identityHashCode(sj));
        System.out.println(System.identityHashCode(sji));
        System.out.println(System.identityHashCode(sjs));
        System.out.println(sj == sjs);// false

        char[] Cj = { 'J', 'A', 'V', 'A' };
        String Sj = new String(Cj);
        String Sji = Sj.intern();
        String Sjs = "JAVA";
        System.out.println(System.identityHashCode(Sj));
        System.out.println(System.identityHashCode(Sji));
        System.out.println(System.identityHashCode(Sjs));
        System.out.println(Sj == Sjs);// true

        char[] ct =
        { 't', 'r', 'u', 'e' };
        String st = new String(ct);
        String sti = st.intern();
        String sts = "true";
        System.out.println(System.identityHashCode(st));
        System.out.println(System.identityHashCode(sti));
        System.out.println(System.identityHashCode(sts));
        System.out.println(st == sts);// false


    }
}

The program prints, for each string, the identity hash code of

  • the string that is created with new String
  • the string that is returned by String#intern
  • the string that is given as a literal

The output is along the lines of this:

366712642
366712642
366712642
true
1829164700
2018699554
2018699554
false
1311053135
1311053135
1311053135
true
118352462
1550089733
1550089733
false

One can see that for the String "java", the identity hash code of the new String is different from that of the string literal, but that the latter is the same as the one for the result of calling String#intern. This strongly suggests that String#intern returned the very same object as the literal, and not the string it was called on.

I also added the String "true" as another test case. It shows the same behavior, presumably because the string "true" has already been interned while the VM was bootstrapping.
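
As a rough way to check individual candidates from the Java side, one can compare a freshly created string with the result of its own intern() call. This is only a sketch: the class name StringPoolProbe and the method wasAlreadyPooled() are made up for illustration, and it assumes a JVM that puts the passed object itself into the pool when no equal string is there yet (the behavior observed above). On a JVM that always pools a copy, the first call would report true for every string. Note also that the probe has a side effect: it interns the string it tests.

```java
class StringPoolProbe {

    // Hypothetical helper: returns true if intern() handed back a different object,
    // which means an equal string was in the pool before this call.
    // Side effect: afterwards an equal string is in the pool in any case.
    static boolean wasAlreadyPooled(char[] chars) {
        String fresh = new String(chars);
        return fresh.intern() != fresh;
    }

    public static void main(String[] args) {
        // Built from char arrays on purpose: a literal in this class would
        // itself put the string into the pool.
        char[][] candidates = {
            {'a', 'b', 'h', 'i'},
            {'j', 'a', 'v', 'a'},
            {'t', 'r', 'u', 'e'},
        };
        for (char[] candidate : candidates) {
            boolean pooled = wasAlreadyPooled(candidate);
            // Results depend on the JVM vendor and version.
            System.out.println(new String(candidate) + " already pooled: " + pooled);
        }
    }
}
```

Probing the same content a second time always reports true, because the first probe has interned it.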

Schnook answered 27/3, 2018 at 21:39 Comment(3)
Do you know why OpenJDK on Linux prints true, true, true, false? I thought that a common word like "java" will be interned. Is there a way to inspect what's inside the String Pool? - Onesided
I have tried with String "false" and it is returning true as expected. @Schnook Do we have any mechanism to find out which strings have already appeared by the time the VM is bootstrapped? - Czar
@KarolDowbecki Again, it's close to impossible to tell what the JVM is doing internally here. There once was a question about how to print the whole string pool, and I tried to answer it with some adventurous hack, but really looking at the JVM-internal pool from Java side is almost certainly impossible (my gut feeling is that it could also be a security issue). A wild guess: Maybe in the Oracle JVM, the string appears only as part of other strings, like "java.lang...", and not as an individual literal? - Schnook
J
1

You are not using intern correctly. intern does not modify the string object it's called on (strings are immutable anyway), but returns the canonical representation of that string - which you are just discarding. Instead, you should assign it to a variable and use that variable in your checks. E.g.:

sj1 = sj1.intern();
Jibber answered 27/3, 2018 at 21:12 Comment(0)
O
1

On OpenJDK 1.8.0u151 and OpenJDK 9.0.4

char[] cj = {'j','a','v','a'};
String sj = new String(cj);
sj.intern();
String sc = "java";
System.out.println(sj == sc); 

prints true. However, this == check depends on which Strings have been interned in the String Pool before String sc = "java" is executed. Compile-time String constants are interned when the JVM resolves them, so the sc reference now points to the "java" in the String Pool that was put there by sj.intern() using the sj reference.

If you try allocating the String "java" before like:

String before = "java"; // the literal is interned before sj.intern() runs
char[] cj = {'j','a','v','a'};
String sj = new String(cj);
sj.intern();
String sc = "java";
System.out.println(sj == sc);

the code will now print false since sj.intern() will now have no side effects as the "java" String was interned before.

To debug your problem, check which strings are already in the String Pool before you reach the failing check. This might depend on your JVM vendor or version.

One could argue that calling intern() just for the side effect of adding the value to the String Pool is pointless. Writing sj = sj.intern() is the right way to intern the String.

Onesided answered 27/3, 2018 at 21:24 Comment(0)
