I did a little investigation to find out how the String.intern() method is implemented in Java.
I looked at the C++ implementation of the intern pool in OpenJDK 6, and what I saw there was a simple HashSet. To me this means that when someone interns a String, the following steps are performed:
- computing the hash code of the given String
- finding the appropriate bucket
- comparing the given String to all the other Strings in the bucket. Before this step the bucket may hold zero Strings, one String, or a LOT OF Strings. So if the given String has previously been put in the bucket, we get at least one comparison (that's the best case; after many collisions there may be many other Strings in the bucket)
- if the String is found in the bucket, it is returned by the intern() method
- if the String is not found in the bucket, it is put in the bucket and returned by the intern() method
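The steps above can be sketched in Java roughly as follows. This is only an illustration of the lookup logic as I described it, not the real OpenJDK code: the class name SimpleInternPool, the fixed bucket count, and the comparisons counter are all hypothetical, and the sketch ignores synchronization, resizing, and garbage collection of pooled Strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified intern pool: a fixed array of buckets,
// each bucket being a plain list of Strings.
class SimpleInternPool {
    private final List<List<String>> buckets = new ArrayList<>();
    private long comparisons = 0; // counts equals() calls made during lookups

    SimpleInternPool(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<String>());
        }
    }

    String intern(String s) {
        // Step 1: compute the hash code of the given String.
        int hash = s.hashCode();
        // Step 2: find the bucket (mask the sign bit so the index is non-negative).
        int index = (hash & 0x7fffffff) % buckets.size();
        List<String> bucket = buckets.get(index);
        // Step 3: compare the given String to every String already in the bucket.
        for (String candidate : bucket) {
            comparisons++;
            if (candidate.equals(s)) {
                // Step 4: found, so return the pooled instance.
                return candidate;
            }
        }
        // Step 5: not found, so store the String and return it.
        bucket.add(s);
        return s;
    }

    long comparisons() {
        return comparisons;
    }
}
```

With this sketch, interning a String that is already pooled costs one hash computation plus at least one equals() call, and more if other Strings landed in the same bucket.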
Many people say that str1.intern() == str2.intern() would be faster than str1.equals(str2).
But I cannot see why it should be faster.
As far as I can see, with str1.equals(str2) we always have two strings compared char by char in the String.equals() method.
With str1.intern() == str2.intern(), how many comparisons do we need to get the String from the pool or to put it there? It can be a lot of comparisons, and they are simple char-by-char comparisons too.
So even though str1.intern() == str2.intern() uses == to compare the Strings, it still pays for many additional actions, such as the comparisons described above.
Once I understood this, I decided to run some benchmarks.
The first results showed that str1.intern() == str2.intern() was faster than str1.equals(str2).
This behaviour was caused by the fact that String.intern() is a native method, so it does not have to be interpreted every time, whereas String.equals() is a Java method.
So I then used the -Xcomp option to make the JVM compile all the code on start.
After that, equals was faster than intern.
I tested this on Java 6 and 7.
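For reference, a comparison of this kind can be sketched as below. This is not my original benchmark, just a naive timing loop under stated assumptions: the class InternVsEquals, its helper methods, and the input sizes are made up for illustration, and a hand-rolled System.nanoTime() loop like this is easily distorted by JIT warm-up and dead-code elimination, so the numbers it prints should be treated with caution.

```java
// Hypothetical micro-benchmark sketch: equals() versus intern() at the comparison site.
class InternVsEquals {

    // Builds n distinct String objects drawn from 100 different values.
    static String[] makeStrings(int n) {
        String[] out = new String[n];
        for (int i = 0; i < n; i++) {
            out[i] = "value-" + (i % 100); // runtime concatenation creates a new object
        }
        return out;
    }

    // Counts equal pairs with String.equals().
    static int countWithEquals(String[] a, String[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].equals(b[i])) {
                matches++;
            }
        }
        return matches;
    }

    // Counts equal pairs by interning both sides on every comparison.
    static int countWithIntern(String[] a, String[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].intern() == b[i].intern()) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] left = makeStrings(100000);
        String[] right = makeStrings(100000);

        // Warm-up rounds so both methods get a chance to be compiled.
        for (int i = 0; i < 10; i++) {
            countWithEquals(left, right);
            countWithIntern(left, right);
        }

        long t0 = System.nanoTime();
        int byEquals = countWithEquals(left, right);
        long t1 = System.nanoTime();
        int byIntern = countWithIntern(left, right);
        long t2 = System.nanoTime();

        System.out.println("equals: " + byEquals + " matches, " + (t1 - t0) + " ns");
        System.out.println("intern: " + byIntern + " matches, " + (t2 - t1) + " ns");
    }
}
```

Running it once normally and once with -Xcomp would reproduce the two setups I compared; a proper harness would be a safer way to get trustworthy numbers.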
So my question is: have you ever seen a situation where interning increased the speed of String comparison? If yes, how can that be?
Or maybe intern() can only help to save memory?
str1.intern() == str2.intern() - no! You're supposed to have the strings already interned. Interning them at the comparison site is pure overhead. (Whether interning is useful when you're using it properly is still debatable, but interning like this is just useless.) – Clausius
The String.hashCode method has been optimized for very good distribution, such that in a hash table, you will get very few collisions. – Hufuf