Where do Java and .NET string literals reside?
V

7

31

A recent question about string literals in .NET caught my eye. I know that string literals are interned so that different strings with the same value refer to the same object. I also know that a string can be interned at runtime:

string now = string.Intern(DateTime.Now.ToString());

Obviously, a string that is interned at runtime resides on the heap, but I had assumed that a literal is placed in the program's data segment (and said so in my answer to said question). However, I don't remember seeing this documented anywhere. I assumed it because that is how I would do it, and because the ldstr IL instruction is used to load literals and no allocation seems to take place.

To cut a long story short, where do string literals reside? Is it on the heap, in the data segment, or somewhere I haven't thought of?


Edit: If string literals do reside on the heap, when are they allocated?

Vizzone answered 16/12, 2008 at 20:18 Comment(0)
A
110

Strings in .NET are reference types, so they are always on the heap (even when they are interned). You can verify this using a debugger such as WinDbg.

If you have the class below

   class SomeType {
      public void Foo() {
         string s = "hello world";
         Console.WriteLine(s);
         Console.WriteLine("press enter");
         Console.ReadLine();
      }
   }

And you call Foo() on an instance, you can use WinDbg to inspect the heap.

For a small program the reference will most likely be stored in a register, so the easiest way to find the reference to the specific string is to run !dso. This gives us the address of the string in question:

0:000> !dso
OS Thread Id: 0x1660 (0)
ESP/REG  Object   Name
002bf0a4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0b4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0e8 025d4e5c System.Byte[]
002bf0ec 025d4c0c System.IO.__ConsoleStream
002bf110 025d4c3c System.IO.StreamReader
002bf114 025d4c3c System.IO.StreamReader
002bf12c 025d5180 System.IO.TextReader+SyncTextReader
002bf130 025d4c3c System.IO.StreamReader
002bf140 025d5180 System.IO.TextReader+SyncTextReader
002bf14c 025d5180 System.IO.TextReader+SyncTextReader
002bf15c 025d2d04 System.String    hello world             // THIS IS THE ONE
002bf224 025d2ccc System.Object[]    (System.String[])
002bf3d0 025d2ccc System.Object[]    (System.String[])
002bf3f8 025d2ccc System.Object[]    (System.String[])

Now use !gcgen to find out which generation the instance is in:

0:000> !gcgen 025d2d04 
Gen 0

It's in generation zero, i.e. it has just been allocated. Who's rooting it?

0:000> !gcroot 025d2d04 
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 1660
ESP:2bf15c:Root:025d2d04(System.String)
Scan Thread 2 OSTHread 16b4
DOMAIN(000E4840):HANDLE(Pinned):6513f4:Root:035d2020(System.Object[])->
025d2d04(System.String)

The ESP entry is the stack of our Foo() method, but notice that there is an object[] as well. That's the intern table. Let's take a look.

0:000> !dumparray 035d2020
Name: System.Object[]
MethodTable: 006984c4
EEClass: 00698444
Size: 528(0x210) bytes
Array: Rank 1, Number of elements 128, Type CLASS
Element Methodtable: 00696d3c
[0] 025d1360
[1] 025d137c
[2] 025d139c
[3] 025d13b0
[4] 025d13d0
[5] 025d1400
[6] 025d1424
...
[36] 025d2d04  // THIS IS OUR STRING
...
[126] null
[127] null

I reduced the output somewhat, but you get the idea.

In conclusion: strings are on the heap, even when they are interned. The intern table holds a reference to the instance on the heap. That is, interned strings are not collected during GC because the intern table roots them.

Amatruda answered 16/12, 2008 at 20:22 Comment(0)
S
12

In Java (from the Java Glossary):

In Sun’s JVM, the interned Strings (which includes String literals) are stored in a special pool of RAM called the perm gen, where the JVM also loads classes and stores natively compiled code. However, the interned Strings behave no differently than had they been stored in the ordinary object heap.
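
To make the "literals are interned" part concrete, here is a minimal sketch in Java. The class and method names are made up for illustration. It only demonstrates the identity guarantee the language gives for literals and compile-time constant expressions; it says nothing about which memory region a particular JVM uses for the pool.

```java
// Hypothetical demo class; not taken from any of the answers.
class LiteralPoolDemo {

    // Two occurrences of the same literal refer to one String object.
    static boolean literalsShareInstance() {
        String a = "hello world";
        String b = "hello world";
        return a == b; // reference comparison, not equals()
    }

    // A compile-time constant expression is folded by the compiler
    // and pooled just like a plain literal.
    static boolean constantExpressionIsPooled() {
        String c = "hello " + "world";
        return c == "hello world";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());      // prints true
        System.out.println(constantExpressionIsPooled()); // prints true
    }
}
```

Apart from the shared identity, these strings are used like any other String object, which is consistent with the quoted statement that they behave no differently from ordinary heap objects.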

Sevier answered 16/12, 2008 at 20:38 Comment(1)
A normative reference should be found. You can't just cite or quote arbitrary Internet junk. - Taxiway
R
3

Correct me if I am wrong, but don't all objects reside on the heap, in both Java and .NET?

Relinquish answered 16/12, 2008 at 20:21 Comment(3)
Value types in .NET reside on the stack unless they are part of a reference type, in which case they are on the heap. - Amatruda
Right, I would exclude value types from the "object" category, but then again I'm used to Java and not .NET. - Relinquish
If value types are small enough, they might not even be on the stack but only in registers. - Lovering
L
1

In .NET, string literals, when "interned", are stored in a special data structure called the "intern table". This is separate from the heap and the stack. Not all strings are interned, however... I'm pretty sure that those that aren't are stored on the heap.

I don't know about Java.

Leopard answered 16/12, 2008 at 20:23 Comment(2)
Surely the intern table just holds references to the strings and doesn't store the actual bytes that make up the string? - Vizzone
The intern table holds references to the strings on the heap. - Amatruda
V
1

I found this on MSDN's site about the ldstr IL instruction:

The ldstr instruction pushes an object reference (type O) to a new string object representing the specific string literal stored in the metadata. The ldstr instruction allocates the requisite amount of memory and performs any format conversion required to convert the string literal from the form used in the file to the string format required at runtime.

The Common Language Infrastructure (CLI) guarantees that the result of two ldstr instructions referring to two metadata tokens that have the same sequence of characters return precisely the same string object (a process known as "string interning").

This implies that the string literals are in fact stored on the heap in .NET (unlike Java as pointed out by mmyers).

Vizzone answered 17/12, 2008 at 7:8 Comment(1)
No, it only says they behave the same as if they were stored on the normal heap. - Embitter
D
0

In Java, strings, like all objects, reside on the heap. Only local variables (primitives such as ints and chars, and references to objects) reside on the stack.

Dourine answered 16/12, 2008 at 20:44 Comment(0)
H
-1

Interned Strings in Java are located in a separate pool called the String Pool. This pool is maintained by the String class and resides on the normal heap (not the perm gen mentioned above, which is used for storing the class data).

As I understand it, not all Strings are interned, but calling myString.intern() returns a String that is guaranteed to come from the String Pool.
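
The difference between an ordinary runtime string and its interned counterpart can be sketched as follows. The class and method names are hypothetical. The sketch shows only object identity, not where the pool lives in memory.

```java
// Hypothetical demo class; not taken from any of the answers.
class InternDemo {

    // A string assembled at runtime is a fresh object, distinct from
    // the pooled literal even though the contents are equal.
    static boolean runtimeStringIsDistinct() {
        String literal = "hello";
        String built = new StringBuilder("hel").append("lo").toString();
        return built != literal && built.equals(literal);
    }

    // intern() returns the canonical pooled instance, which for this
    // content is the same object the literal refers to.
    static boolean internReturnsPooledInstance() {
        String literal = "hello";
        String built = new StringBuilder("hel").append("lo").toString();
        return built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(runtimeStringIsDistinct());     // prints true
        System.out.println(internReturnsPooledInstance()); // prints true
    }
}
```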

See also: http://www.javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html and the javadoc http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#intern()

Hispanicism answered 17/12, 2008 at 13:53 Comment(1)
It is maintained by the compiler and classloader in the case of literal strings. At one time it was indeed in the PermGen. - Taxiway
