How many String objects will be created
F

3

7

I have the following Java code:

public String makinStrings() {
  String s = "Fred";
  s = s + "47";
  s = s.substring(2, 5);
  s = s.toUpperCase();
  return s.toString();
}

The question is fairly simple: how many String objects will be created when this method is invoked?

At the beginning I answered that 5 String objects are created, but the answer from my book says that only 3 objects are created and no explanation was given (this is a SCJP question).

From my point of view there are 5 objects: "Fred", "47", "Fred47", "ed4", "ED4".

I also found this question on a SCJP simulation exam, with the same answer 3.

Flirtation answered 10/9, 2011 at 8:36 Comment(3)
I imagine the compiler will inline the first two statements and remove the last (redundant) method call. That leaves you with "Fred47", "ed4", "ED4".Expostulatory
@Jared s is not a compile-time constant expression so that doesn't happen. (Compile the code and use javap -c, or even strings.)Largehearted
possible duplicate of Java - How many String Objects?Shuffleboard
O
15

"Fred" and "47" will come from the string literal pool. As such they won't be created when the method is invoked. Instead they will be put there when the class is loaded (or earlier, if other classes use constants with the same value).

"Fred47", "ed4" and "ED4" are the 3 String objects that will be created on each method invocation.

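A quick way to check this is to compare references across two invocations. The following is only a sketch: the class name StringCountDemo and the helper fredLiteral() are made up for illustration, and makinStrings() is copied from the question.

```java
class StringCountDemo {
    // Same method body as in the question.
    static String makinStrings() {
        String s = "Fred";
        s = s + "47";
        s = s.substring(2, 5);
        s = s.toUpperCase();
        return s.toString();
    }

    // Hypothetical helper: returns the literal "Fred", which lives in the pool.
    static String fredLiteral() {
        return "Fred";
    }

    public static void main(String[] args) {
        String first = makinStrings();
        String second = makinStrings();
        System.out.println(first.equals(second));            // true: same characters
        System.out.println(first == second);                 // false: a new object per call
        System.out.println(fredLiteral() == fredLiteral());  // true: the literal is shared
    }
}
```

The reference comparison only shows that the result is a fresh object on each call; it does not count the intermediate "Fred47" and "ed4" objects directly.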
Omeromero answered 10/9, 2011 at 8:50 Comment(6)
Won't "Fred47", "ed4" and "ED4" be in the string literal pool after the first method invocation, too? I believe Jared's answer/comment is correct (+1), because all the compilers I know of do these simple inlinings. That's why concatenating strings is no longer so extremely expensive (though still to some extent).Sheilahshekel
It is apparently legal for a JVM to instantiate string literals when they are first loaded from the constant pool, instead of loading them eagerly at class load time. I don't think this implementation detail is really relevant to the original question, but it's an interesting piece of trivia.Alleyway
Note if the first two lines had been written as a single statement, "Fred47" would be used directly as "Fred" + "47" is a compile-time constant expression. Also, it's possible for the library to be differently implemented and produce a different number of string objects.Largehearted
@Joachim Sauer.. how will it put something there even before creating it? Even a String literal has to be instantiated first, and only later is it reused.Cynewulf
@ntc: yes, but String literals are defined at a class level (you can take a look at the class file format specification to verify that) and a JVM can easily load them all when the class is loaded. As Stuart Cook noted it need not do that at class loading time, however.Omeromero
@DaveBall: why should they? They are not string literals (or, more precisely: they are not compile time constant expressions of type String). They are normal String instances and unless someone calls intern() on them there's no reason for them to be interned.Omeromero
C
2

Programs tend to contain a lot of String literals in their code. In Java, these constants are collected in something called the string table for efficiency. For instance, if you use the string "Name: " in ten different places, the JVM (typically) has just one instance of that String and in all ten places where it's used, the references all point to that one instance. This saves memory.

This optimization is possible because String is immutable. If it were possible to change a String, changing it in one place would mean it changes in the other nine as well. That's why any operation that "changes" a String returns a new instance. That's why if you do this:

String s = "boink";
s.toUpperCase();
System.out.println(s);

it prints boink, not BOINK.

Now there's one more tricky bit: in older JDKs (before Java 7 update 6), multiple instances of java.lang.String may point to the same underlying char[] for their character data. In other words, they may be different views on the same char[], each using just a slice of the array. Again, this is an optimization for efficiency. The substring() method is one of the cases where this happens.

String s1 = "Fred47";

//String s1: data=[ 'F', 'r', 'e', 'd', '4', '7'], offset=0, length=6
//                   ^........................^

String s2 = s1.substring(2, 5);

//String s2: data=[ 'F', 'r', 'e', 'd', '4', '7'], offset=2, length=3
//                             ^.........^
// the two strings are sharing the same char[]!
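To make the sharing concrete, here is a simplified model of a string that is a view onto a char[]. This is a hypothetical sketch, not the real java.lang.String source: the class SliceString and its methods of(), slice() and compact() are invented names for illustration.

```java
import java.util.Arrays;

// Hypothetical model of a string backed by a (possibly shared) char[].
// NOT the actual java.lang.String implementation.
final class SliceString {
    final char[] data;  // backing array, may be shared with other instances
    final int offset;   // index of the first visible character
    final int length;   // number of visible characters

    SliceString(char[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    static SliceString of(String text) {
        char[] chars = text.toCharArray();
        return new SliceString(chars, 0, chars.length);
    }

    // Like substring(begin, end), but reuses the backing array instead of copying.
    SliceString slice(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException();
        }
        return new SliceString(data, offset + begin, end - begin);
    }

    // Copies only the visible characters into a new, small backing array.
    SliceString compact() {
        char[] copy = Arrays.copyOfRange(data, offset, offset + length);
        return new SliceString(copy, 0, length);
    }

    @Override
    public String toString() {
        return new String(data, offset, length);
    }
}
```

In this model, SliceString.of("Fred47").slice(2, 5) reads as "ed4" but still holds the full six-character array, while compact() gives it a three-character array of its own.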

In your SCJP question, all this boils down to:

  • The string "Fred" is taken from the String table.
  • The string "47" is taken from the String table.
  • The string "Fred47" is created during the method call. //1
  • The string "ed4" is created during the method call, sharing the same backing array as "Fred47" //2
  • The string "ED4" is created during the method call. //3
  • The s.toString() doesn't create a new one, it just returns this.
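A couple of these points can be checked with reference comparisons. This is just a sketch; the class name ConstantFoldingDemo and its method names are made up for illustration.

```java
class ConstantFoldingDemo {
    // String.toString() returns the receiver itself, so no object is created.
    static boolean toStringReturnsThis() {
        String s = "ED4";
        return s.toString() == s;
    }

    // "Fred" + "47" is a compile-time constant expression, so it is the
    // same pooled object as the literal "Fred47".
    static boolean constantExpressionIsPooled() {
        return ("Fred" + "47") == "Fred47";
    }

    // s is an ordinary local variable, so s + "47" is evaluated at runtime
    // and yields a newly created String with the same contents.
    static boolean runtimeConcatIsNewObject() {
        String s = "Fred";
        s = s + "47";
        return s != "Fred47" && s.equals("Fred47");
    }

    public static void main(String[] args) {
        System.out.println(toStringReturnsThis());        // true
        System.out.println(constantExpressionIsPooled()); // true
        System.out.println(runtimeConcatIsNewObject());   // true
    }
}
```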

One interesting edge case of all this: consider what happens if you have a really long String, for example, a web page taken from the Internet, let's say the length of the char[] is two megabytes. If you take the substring(0, 4) of this, you get a new String that looks like it's just four characters long, but it still shares those two megabytes of backing data. This isn't all that common in the real world, but it can be a huge waste of memory! In the (rare) case that you run into this as a problem, you can use new String(hugeString.substring(0, 4)) to create a String with new, small backing array.

Finally, it's possible to force a String into the string table at runtime by calling intern() on it. The basic rule in this case: don't do it. The extended rule: don't do it unless you've used a memory profiler to ascertain that it's a useful optimization.
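For completeness, this is roughly what intern() does to reference identity. It is a minimal sketch, and the class name InternDemo and the method buildAtRuntime() are hypothetical.

```java
class InternDemo {
    // Builds "ED4" at runtime, so the result is not the pooled literal.
    static String buildAtRuntime() {
        return new String(new char[] {'E', 'D', '4'});
    }

    public static void main(String[] args) {
        String built = buildAtRuntime();
        System.out.println(built == "ED4");          // false: a separate object
        System.out.println(built.equals("ED4"));     // true: same characters
        System.out.println(built.intern() == "ED4"); // true: intern() returns the pooled instance
    }
}
```

Note that intern() does not change built itself; it only returns the canonical instance, so you have to use the returned reference.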

Celluloid answered 10/9, 2011 at 9:20 Comment(1)
This is incorrect. None of the String constants are coming from a "String Pool"; they all come from the class's constant pool. If the same literal constant like "Fred47" is used in several classes, every one of them has the same constant in its constant pool. See the bytecode in Chip McCormick's answer.Vivyan
L
2

Based on the javap output, it looks like a StringBuilder, not a String, is created during concatenation. Three Strings are then created, by substring(), toUpperCase() and toString().

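The StringBuilder.toString() call at offset 19 is not redundant because it transforms the StringBuilder into a String. The final String.toString() call at offset 36 creates nothing; it just returns the same object.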

>javap -c Test
Compiled from "Test.java"

public java.lang.String makinStrings();
Code:
0:   ldc     #5; //String Fred
2:   astore_1
3:   new     #6; //class java/lang/StringBuilder
6:   dup
7:   invokespecial   #7; //Method java/lang/StringBuilder."<init>":()V
10:  aload_1
11:  invokevirtual   #8; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14:  ldc     #9; //String 47
16:  invokevirtual   #8; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19:  invokevirtual   #10; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22:  astore_1
23:  aload_1
24:  iconst_2
25:  iconst_5
26:  invokevirtual   #11; //Method java/lang/String.substring:(II)Ljava/lang/String;
29:  astore_1
30:  aload_1
31:  invokevirtual   #12; //Method java/lang/String.toUpperCase:()Ljava/lang/String;
34:  astore_1
35:  aload_1
36:  invokevirtual   #13; //Method java/lang/String.toString:()Ljava/lang/String;
39:  areturn

}

Leonard answered 10/9, 2011 at 11:16 Comment(1)
Note that with JDK 17 (and probably since JDK 9), no StringBuilder is created. Rather, according to javap, StringConcatFactory.makeConcatWithConstants is used.Thundery
