Counting String objects created by Java code [duplicate]
L

14

24

How many String objects are created by the following code?

String x = new String("xyz");
String y = "abc";
x = x + y;

I have visited many websites: some say this code creates 3 objects and some say it creates 4. How many objects are created once this code has executed?

Lap answered 1/4, 2015 at 12:13 Comment(5)
Take a look through the 1.070.000 search results first and clarify how your question differs from them: google.com/…Photoactive
No that's not a good duplicate. The reassignment of reference x adds an important consideration here.Slacken
The reassignment makes no difference. It is the expression that makes the differenceHaugh
"So this bounty is an attempt to get an answer that can clarify this with proper references. Please take into account any changes in Java 8 that may prove that the answers present are lacking in some details." 1) The accepted answer is 100% correct. If you accept that it is correct, no further clarification is required. If you don't "believe" it, then clarification won't help. The answers that just give a number are incorrect. If you want us to (magically) make all answers say the same thing ... your are dreaming, sunshine.Haugh
2) No changes in Java 8 change the correctness of the answer, or require clarification. The only thing that changes in Java 8 is that the string pool doesn't live in permgen because permgen no longer exists. But the accepted answer doesn't mention permgen. (It is only mentioned in a couple of tangential comments about permgen GC that are only of (ancient) historical interest.)Haugh
T
53

By the end of the run there will be four String objects:

  1. A String that corresponds to the interned "xyz" literal
  2. Its copy created by new String("xyz")
  3. A String that corresponds to the interned "abc" literal
  4. A String that corresponds to concatenation "xyz" + "abc"

The real question is attributing some or all of these objects to your program. One can reasonably claim that as few as two or as many as four Strings are created by your code. Even though there are four String objects in total, objects 1 and 3 may not necessarily be created by your code, because they are in a constant pool, so they get created outside your code's direct control.

Tahitian answered 1/4, 2015 at 12:25 Comment(5)
After the lines are executed, objects 1 and 2 are eligible for garbage collection. In fact, right after the String constructor finishes there is no pointer to the original "xyz" literal.Thoracoplasty
"String object 4 can be computed by the compiler and turned into an interned constant as well" - Not correct. The expression is not a compile-time constant expression according to the JLS rules.Haugh
@Thoracoplasty Literals are a special case, as there must be some form of reference from the defining class to the literal. Depending on the actual VM implementation they may even never be garbage collected (permanent generation in older Hotspot VMs).Lendlease
@Lendlease - really, really old Hotspot VMs. Permgen GC was certainly implemented by Java 5.0.Haugh
Is there any way to count how many objects are created by string literal and String object ?Pleach
H
17

This Answer is to correct a misconception that is being put about by some of the other Answers:

For example:

The compiler might substitute x + y with a constant ("xyzabc"), though. @Binkan Salaryman

... and String object 4 [the String that corresponds to concatenation] can be computed by the compiler and turned into an interned constant as well. @dasblinkenlight

This is incorrect. The JLS states this:

15.18.1. String Concatenation Operator +

....

The String object is newly created (§12.5) unless the expression is a constant expression (§15.28).

In order to qualify as a constant expression, the variable names in the expression must be:

Simple names (§6.5.6.1) that refer to constant variables (§4.12.4).

where a "constant variable" is defined as:

A constant variable is a final variable of primitive type or type String that is initialized with a constant expression (§15.28).

In this example, neither x nor y is final, so they are not constant variables. And even if they were final, x still wouldn't be a constant variable because of the use of the new operator in its initialization.


In short, the Java compiler is not permitted to use an intern'd constant "xyzabc" as the result of the concatenation expression.

If I added the following statement at the end:

    System.out.println(x == "xyzabc");

it will always print false ... assuming that the compiler is conformant to the Java Language Specification.
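
To make the JLS rule concrete, here is a minimal sketch contrasting a concatenation of constant variables with the non-constant cases discussed above. The class and method names are made up for illustration; the behaviour assumes a compiler and JVM that conform to the JLS.

```java
final class ConstantConcatDemo {
    // Both operands are constant variables (final, initialized with literals),
    // so a + b is a constant expression and its value is interned.
    static boolean constantOperands() {
        final String a = "xyz";
        final String b = "abc";
        return (a + b) == "xyzabc";
    }

    // Neither operand is declared final, so x + y must yield a newly created String.
    static boolean nonFinalOperands() {
        String x = "xyz";
        String y = "abc";
        return (x + y) == "xyzabc";
    }

    // final alone is not enough: new String(...) is not a constant expression,
    // so x is not a constant variable.
    static boolean finalButNotConstant() {
        final String x = new String("xyz");
        final String y = "abc";
        return (x + y) == "xyzabc";
    }

    public static void main(String[] args) {
        System.out.println(constantOperands());    // true
        System.out.println(nonFinalOperands());    // false
        System.out.println(finalButNotConstant()); // false
    }
}
```

Only the first method compares two references to the same interned object; the other two compare a freshly created String against the pooled literal.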

Haugh answered 1/4, 2015 at 12:14 Comment(7)
It is entirely possible within the rules though for the JIT compiler to decide not to create anything other than the result (which, as you point out, has to be a new string on every invocation).Samaveda
That is correct. However, that is an optimization that is less likely than most ... because the cost-benefit analysis is not favorable.Haugh
Yes, it was more just an example of why people should be very careful when reasoning about this kind of thing. (And ultimately why it makes better sense to actually measure it in specific Java implementations instead.)Samaveda
This is all very good but it's a pity that the question remains unanswered. How many strings?Borecole
@SlodgeMonster - See my other answer. 1) The question is ambiguous, 2) The answer is implementation dependent.Haugh
So the question boils down to When are Java Strings interned? Linked SO Question is interesting because it notes the difference between 'interning' and use of the Class Constant Pool. Java 8 Language Specification even states "Constant expressions of type String are always "interned" so as to share unique instances, using the method String.intern". Added with this answer, it sketches a nice complete picture.Ulda
Well yes. But the point I am making is that x + y in the OP's question is NOT a constant expression. Therefore it is not implicitly interned. In this case, the "optimization" is not permitted by the JLS.Haugh
G
12

Take a look at the decompiled class and you'll see everything :) The answer should be:

  • two strings ("xyz" and "abc") are only references to entries in the constant pool, so they are not created by your code
  • one string is created directly (new String("xyz"))
  • the string concatenation is compiled into StringBuilder calls, so the last string is created indirectly

    public java.lang.String method();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC
    Code:
      stack=3, locals=3, args_size=1
     0: new           #2                  // class java/lang/String
     3: dup
     4: ldc           #3                  // String xyz
     6: invokespecial #4                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
     9: astore_1
    10: ldc           #5                  // String abc
    12: astore_2
    13: new           #6                  // class java/lang/StringBuilder
    16: dup
    17: invokespecial #7                  // Method java/lang/StringBuilder."<init>":()V
    20: aload_1
    21: invokevirtual #8                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    24: aload_2
    25: invokevirtual #8                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    28: invokevirtual #9                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    31: astore_1
    32: aload_1
    33: areturn
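
Offsets 13 to 28 of the listing above correspond roughly to the hand-written Java below. This is only a sketch: the class and method names are invented for illustration, and as far as I know javac from JDK 9 onward emits an invokedynamic call for concatenation instead of StringBuilder calls, so the bytecode you see may differ.

```java
final class ConcatDesugared {
    // Hand-written equivalent of bytecode offsets 13-28 in the listing above.
    static String concat(String x, String y) {
        return new StringBuilder().append(x).append(y).toString();
    }

    public static void main(String[] args) {
        String x = new String("xyz");
        String y = "abc";
        x = concat(x, y);
        System.out.println(x);             // xyzabc
        System.out.println(x == "xyzabc"); // false: toString() built a new String
    }
}
```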
    
Geyser answered 22/4, 2015 at 13:24 Comment(1)
It's probably different from version to version of java though right?Catechize
C
9

If you want to test for instances, run this code snippet and look at the output:

import static java.lang.System.identityHashCode;

public class Program {
    public static void main(String... args) {
        String x = new String("xyz");
        String y = "abc";
        String z = x + y;

        System.out.printf("x: %d | %d\n", identityHashCode(x), identityHashCode(x.intern()));
        System.out.printf("y: %d | %d\n", identityHashCode(y), identityHashCode(y.intern()));
        System.out.printf("z: %d | %d\n", identityHashCode(z), identityHashCode(z.intern()));
    }
}

I have the following output using jdk1.7.0_67:

x: 414853995 | 1719175803
y: 1405489012 | 1405489012
z: 1881191331 | 1881191331

That's a total of 4 String instances...

Clairclairaudience answered 1/4, 2015 at 12:53 Comment(7)
The number of instances is not the question. How many strings were created is the question.Clomp
Counting String objects created by Java code is the titleClairclairaudience
exactly; counting strings created.Clomp
Some strings are not created, they are "interned". They just exist as constants in the program. Give this a read: https://mcmap.net/q/55353/-what-is-the-java-string-pool-and-how-is-quot-s-quot-different-from-new-string-quot-s-quot-duplicateClomp
I don't see any flaws with my code. The program creates Strings at run time (keyword: new) and the compiler adds Strings - created at compile time - to the constant pool (intern)Clairclairaudience
I would never say something is created at compile-time. As far as I am concerned we're discussing how many string objects are dynamically allocated (at runtime).Clomp
@Clomp You're right, nothing is created at compile-time, but they can be created at class loading time. Does that count as part of this code or not? I wouldn't think so.Samaveda
S
9

The answer is 4.

As you have used the new keyword, Java will create a new String object in normal (non pool) memory and x will refer to it. In addition to this, the literal "xyz" will be placed in the string pool which is again another string object.

So, the 4 string objects are:

  1. "xyz" (in non-pool memory)
  2. "xyz" (in pool memory)
  3. "abc" (in pool memory)
  4. "xyzabc" (in non-pool memory)

If your code had been like this:

String x = "xyz";
String y = "abc";
x = x + y;

then the answer would be 3.

Note: String #4 is in non-pool memory because String literals and the strings produced by evaluating constant expressions (see JLS §15.28) are the only strings that are implicitly interned.
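
A small sketch of what "implicitly interned" means in practice, using String.intern() to look up the pooled instance. The class and variable names are illustrative only.

```java
import java.util.Arrays;

final class InternDemo {
    static boolean[] check() {
        String heapCopy = new String("xyz"); // distinct object, not the pooled literal
        String concat = heapCopy + "abc";    // created at run time, not interned
        return new boolean[] {
            heapCopy == "xyz",               // false: different object from the literal
            heapCopy.intern() == "xyz",      // true: intern() returns the pooled instance
            concat == "xyzabc",              // false: concatenation result is not pooled
            concat.intern() == "xyzabc"      // true: same pooled instance as the literal
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(check())); // [false, true, false, true]
    }
}
```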

Source: SCJP Sun Certified Programmer for Java 6 (Page:434, Chapter 6)

Stepdame answered 17/4, 2015 at 18:5 Comment(10)
The same concept is also said here #1882422 as mentioned by @JeroenVannevelStepdame
String #4 is not in pool memory. See my answer.Haugh
@StephenC I didn't get you, your answer says a new String object will be created unless and until the expression is a constant expression. And String objects are created in pool memory right? So, can you clarify your point and give a reference from JLS stating the same?Stepdame
@Ramswaroop - 1) A new string object is created unless the expression is a constant expression. (According to the strict JLS definition of a constant expression.) 2) String objects are NOT normally created in pool memory. Only strings created by String.intern() go into the string pool. (That is how the classloader creates the string objects that correspond to string literals, and string constant expressions ... for example.)Haugh
@StephenC "String objects are NOT normally created in pool memory" - Can you give a reference stating this? So far as I know it is incorrect.Stepdame
The source code is the ultimate reference. Look at the source code of the String class.Haugh
@StephenC I have checked the source code and by default all string objects are interned unless it is created using new keyword. This is the only case where you actually need to invoke intern() method if you need the string created using new to be interned.Stepdame
You missed the point. All operations on String objects that create other String objects except for the intern method create them using new String(....) under the hood. Read the source code for java.lang.String here: grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/…. It will confirm that fact.Haugh
String literals and strings produced by evaluating constant expressions (see JLS §15.28) are the only strings that are implicitly interned.Haugh
@StephenC I have emphasized this point of yours in my answer. Thanks!Stepdame
B
4

I would say 4. Here's how:

String x = new String("xyz"); // 2 objects created: the variable and the constant
String y = "abc"; // 1 object created: the variable
x = x + y; // 1 object created: the one by the StringBuilder class
Batts answered 17/4, 2015 at 16:36 Comment(1)
Different string literals are the same object. JLS #3.10.2.Thermomotor
C
3

new String("xyz") for sure creates a new instance. "abc" and "xyz" are stored in the class constant pool, and x = x + y creates a StringBuilder under the hood and therefore creates a new String, so the count of strings is 4 here.


The compiler might substitute x + y with a constant ("xyzabc"), though.

Clairclairaudience answered 1/4, 2015 at 12:20 Comment(3)
I believe you can do it in 3 without violating any JLS requirementsSlacken
I can, but that doesn't mean the compiler does.Clairclairaudience
"The compiler might substitute x + y with a constant ("xyzabc"), though. " - Actually it is not permitted to. Unless the expression qualifies as a compile time constant expression (and it doesn't), the concatenation is required to create a new String objectHaugh
H
3

The reason that you see different answers to the question is (in part) that it is ambiguous.

"How many String objects are created by the following code?"

The ambiguity is in the phrase "created by the following code":

  • Is it asking about the number of String objects that are created by (just) the execution of the code?

  • Or ... is it asking about the number of String objects that need to exist during the execution of the code?

You could argue that the code implicitly creates the String objects that correspond to the literals, not when it is run, but when it is loaded. However, those objects could be shared with other code that uses the same literals. (And if you look hard enough, it is possible that other string objects containing the same character strings get created during the class loading process.)
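
The sharing of literal objects between unrelated code can be observed directly. In this sketch, First and Second are hypothetical classes standing in for "your code" and "other code that uses the same literal".

```java
final class First {
    static String literal() { return "xyz"; }
}

final class Second {
    static String literal() { return "xyz"; }
}

final class SharedLiteralDemo {
    // Both classes mention the literal "xyz"; the JLS requires that
    // equal string literals refer to the same String object.
    static boolean sameObject() {
        return First.literal() == Second.literal();
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // true
    }
}
```

Whichever class is loaded first, both methods end up returning the same object, which is why it is hard to say that either class "created" it.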


Another reason you see different answers is that it is not entirely clear how many strings get created. The various specifications state that new String objects will be created at certain points, but there is "wriggle room" on whether intermediate String objects could be created.

For example, the JLS states that new always creates a new object, and that the string concatenation operator creates a new object except under certain clearly specified cases (see my other answer). However, the specs do not forbid creation of other strings behind the scenes.


But in this case, if we assume that we are using a modern Hotspot JVM, then:

  • 2 string objects exist (for the string literals) before the code starts executing, and
  • 2 new string objects are created (by the new and the + operator) when the code is executed.

The existence / creation of these 4 strings is guaranteed by the JLS.

Haugh answered 24/4, 2015 at 7:15 Comment(0)
B
1

The answer is 4

String x = new String("xyz");//First Object

String y = "abc";//Second Object

x = x + y;//Third, fourth Object
Bearded answered 20/4, 2015 at 11:21 Comment(3)
The x + y expression always generates a new String because x and y are not constant variables.Haugh
But there are strings like x and y in the string pool, so x and y refer to another String... Java only creates new strings if there is no matching string in the string poolBearded
That is not what the JLS says. See my answer.Haugh
P
1

Sometimes it's better to let the bytecode speak. Examining the code using javap:

public static void main(java.lang.String[]);
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=3, args_size=1
         0: new           #16                 // class java/lang/String
         3: dup
         4: ldc           #18                 // String xyz
         6: invokespecial #20                 // Method java/lang/String."<init>":(Ljava/lang/String;)V
         9: astore_1
        10: ldc           #23                 // String abc
        12: astore_2
        13: new           #25                 // class java/lang/StringBuilder
        16: dup
        17: aload_1
        18: invokestatic  #27                 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
        21: invokespecial #31                 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
        24: aload_2
        25: invokevirtual #32                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        28: invokevirtual #36                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        31: astore_1
        32: return
      LineNumberTable:
        line 6: 0
        line 7: 10
        line 8: 13
        line 9: 32
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
               0      33     0  args   [Ljava/lang/String;
              10      23     1     x   Ljava/lang/String;
              13      20     2     y   Ljava/lang/String;
}

Now, as seen from the bytecode:

At `0: new`, a new String object is allocated.
At `3: dup`, an extra reference to the new instance is pushed.
At `4: ldc #18`, the reference to the pooled literal "xyz" is loaded (one String object).
At `6: invokespecial`, the instance initialization method is called with that parameter, initializing the object in non-pool memory.
At `9: astore_1`, the reference is stored in local variable 1 (i.e. x).

So by this time we have two String objects.

At `10: ldc #23`, the reference to the pooled literal "abc" is loaded (third String object).
At `12: astore_2`, the reference is stored in local variable 2 (i.e. y).

So by this time we have three String objects.

At `28: invokevirtual #36` (Method java/lang/StringBuilder.toString:()Ljava/lang/String;), the fourth String object is created.

So we have a total of four String objects in this code.

As I'm new to programming and started learning it only a few months back, point out if I went wrong somewhere and what the correct version is. Thanks :)

Pagel answered 23/4, 2015 at 21:47 Comment(1)
Yes. But as is often the case with "disassembly" answers, this is not the complete picture. It doesn't consider whether the string objects corresponding to the literals were created during class loading ... and whether they still exist at the time the code was executed. It also doesn't consider what happens / could happen when those bytecodes are optimized.Haugh
G
1
Line 1: String x = new String("xyz");
Line 2: String y = "abc";
Line 3: x = x + y;

Strings are immutable, so if an existing String variable needs to change, a new object is created for the assignment. Lines 1 and 2 each create a String object, whereas Line 3 modifies an existing String variable, so a new allocation is needed to hold x + y. So it should create 3 objects.
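
The immutability point can be checked with a short sketch (names are illustrative): keeping a second reference to the original object shows that x = x + y rebinds x to a new object rather than modifying the old one.

```java
final class ImmutabilityDemo {
    static boolean originalUnchanged() {
        String x = new String("xyz");
        String before = x;  // second reference to the same object
        String y = "abc";
        x = x + y;          // x now refers to a different, newly created object
        // The original object still holds "xyz" and is not the object x refers to.
        return before.equals("xyz") && before != x && x.equals("xyzabc");
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged()); // true
    }
}
```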

Genus answered 24/4, 2015 at 8:55 Comment(0)
S
0

4 objects are created.

String x = new String("xyz"); creates 2 objects: one in the SCP (string constant pool) and one in the heap.

String y = "abc"; creates one object in the SCP.

x = x + y; creates one more object in the SCP, and x now refers to it instead of its old object.

Sweyn answered 16/5 at 18:23 Comment(0)
M
-1

1. Objects created in the heap area: "xyz" (created by 'String x') and "xyzabc" (created by the 'x + y' concatenation).

2. Objects created in the SCP (string constant pool): "xyz" (created for future use and not available for garbage collection) and "abc" (created by the 'String y' literal).

So the total number of objects created in this case is 4.

Melnick answered 22/9, 2018 at 16:13 Comment(0)
W
-1

The answer is 5

  1. xyz in non pool memory
  2. xyz in pool memory with no reference
  3. abc in pool memory with reference
  4. xyz still in non pool memory, reference changed to xyzabc in non pool memory
  5. xyzabc in pool memory with no reference
Welcome answered 2/7, 2019 at 5:32 Comment(0)
