How is the zero length array represented in memory?

Java primitive types are mapped to native primitives.
So my question is: how is char value[] = new char[0]; represented in memory?
Does it depend on the compiler (e.g. gcc) used to build the native code? And would it mean that all empty Java Strings point to the same address?

Taphouse answered 8/5, 2015 at 21:10 Comment(6)
Java primitives are not necessarily mapped to native primitives. A Java int is 4 bytes, even if it is implemented on an 8-bit architecture. A Java char is always two bytes.Intrusive
Do you mean the String object itself or the String's internal representation of the characters?Sankaran
An empty string would be an object with a length of 0, and that length should be stored somewhere. Would it matter if all empty strings point to the same address or not?Helices
@CaptainMan:The internal representationTaphouse
@Jongware:I am trying to understand using that as an example why we would not use a null char array insteadTaphouse
@Jim, null and empty arrays are completely different and you can't interchange one for the other.Marine
I
9

Java arrays are objects. They inherit from the Object class.

The JVM spec does not dictate any particular implementation for objects, provided they behave according to the spec. In practice, an object is typically laid out as a header followed by the object's actual fields.

An array in Java is not just a sequence of its primitive components. It is an object, it has the length field, and it has methods. So, like any other object, it has the header, followed by the length, followed by all the array components.

An array allocated with size zero is an object that has the header and the size but no space allocated for actual components.

A reference to an array is just like a reference to any other object. Arrays in Java are not like arrays in C, where if the array was size zero, the pointer that points to its start would actually be invalid. The reference to an array points to the array object, which happens to have length zero and no actual items. If you try to address any element in such an array, there will not be a question of valid pointers. The array reference itself points to a valid object. Then, bounds checking will show any index to be out of bounds, so no further pointer dereferencing will take place.

So the bottom line is that a reference to a char[0] is a valid reference to an actual allocated object. It simply has no data beyond the length.

This is different from null, which is a reference that does not point to any object (in typical JVMs its bits are all zero). For null, no memory other than the reference itself is allocated, whereas for char[0] enough memory is allocated for a header and a length.
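The difference between an empty array and null can be observed from ordinary Java code. The following is a minimal sketch, not a view into the JVM's memory; the class name EmptyArrayVsNull and the helper readFirst are made up for illustration.

```java
// Sketch: a zero-length array is a real object; null is the absence of one.
class EmptyArrayVsNull {

    // Hypothetical helper: reports which exception (if any) reading index 0 raises.
    static String readFirst(char[] chars) {
        try {
            char unused = chars[0];
            return "none";
        } catch (ArrayIndexOutOfBoundsException e) {
            // The reference was valid; the bounds check against length rejected the index.
            return "ArrayIndexOutOfBoundsException";
        } catch (NullPointerException e) {
            // There was no array object to check bounds against.
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        char[] empty = new char[0];
        char[] missing = null;

        System.out.println(empty.length);                     // 0
        System.out.println(empty instanceof Object);          // true
        System.out.println(empty.getClass().getSimpleName()); // char[]
        System.out.println(readFirst(empty));                 // ArrayIndexOutOfBoundsException
        System.out.println(readFirst(missing));               // NullPointerException
    }
}
```

The empty array answers questions about its length and class like any other object, while the null reference fails before a bounds check can even happen.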


As for strings, two empty strings do not necessarily point to the same character array. For example, if you write:

String a = new String();
String b = new String();

You'll get two different empty string objects. Each of them has a distinct empty character array that it points to. This is because the no-args constructor of the String class is implemented like this (in the JDK sources this answer was written against; the exact code varies between JDK versions):

public String() {
    this.value = new char[0];
}

You see the use of the new keyword? This means a new array object is allocated, not copied from anywhere.

Note, however, that if your source was:

String a = "";
String b = "";

Then because of interning, they would be pointing to the same string object, and thus to the same character array. Also, if it was:

String a = new String();
String b = new String(a);

Then you would have two different String objects, but they would both point to the same internal character array. This is because the constructor used on the second line is:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

Again, a pointer to an empty string is certainly not the same as a null pointer. It points to an actual string object which points to an actual character array object.
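The identity claims about the String objects themselves can be checked with the == operator. This is only a sketch: the class name EmptyStringIdentity and the helper sameObject are made up for illustration, and whether the internal arrays are shared cannot be observed through the public API, so the code does not try to show it.

```java
// Sketch: reference identity of empty strings created in different ways.
class EmptyStringIdentity {

    // Hypothetical helper: true only when both references denote the same object.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = new String();
        String b = new String();
        String c = "";
        String d = "";

        System.out.println(sameObject(a, b));          // false: each new creates a distinct object
        System.out.println(sameObject(c, d));          // true: both literals are the interned ""
        System.out.println(sameObject(a, c));          // false: a was created with new
        System.out.println(a.equals(c));               // true: same (empty) contents
        System.out.println(sameObject(a.intern(), c)); // true: intern() returns the pooled ""
        System.out.println(a != null);                 // true: an empty string is not null
    }
}
```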

Intrusive answered 8/5, 2015 at 21:40 Comment(1)
For extra credit, check out the Java Object Layout (JOL) tool for OpenJDK (openjdk.java.net/projects/code-tools/jol), which lets you inspect the actual object layout, after padding etc., of your Java classes.Uvula
B
8

The memory layout is not defined by the Java specifications; it is an implementation detail of each JVM.

Here's how IBM describes the memory layout of an array for their 64bit JVM:

  1. 64 bits for a class pointer (i.e. identifying the type as char[])
  2. 64 bits for flags (e.g. saying that this object is an array)
  3. 64 bits for lock data (for synchronization)
  4. 64 bits for array length (only 32 bits are used, but the field is boundary aligned)
  5. 0 bits for data, since the array has no elements

That's a total of 256 bits or 32 bytes.
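The arithmetic behind that total can be written out as a small sketch. The class name ArrayFootprintSketch and its constants are made up for illustration; they simply restate the four slot widths from the list above and do not measure a real JVM.

```java
// Sketch: adds up the slot widths listed above for a zero-length array.
class ArrayFootprintSketch {

    // Slot widths in bits, copied from the list above (hypothetical constant names).
    static final int CLASS_POINTER_BITS = 64;
    static final int FLAGS_BITS = 64;
    static final int LOCK_BITS = 64;
    static final int LENGTH_SLOT_BITS = 64; // only 32 bits used, but the slot is aligned
    static final int DATA_BITS = 0;         // no elements in a zero-length array

    static int emptyArrayBits() {
        return CLASS_POINTER_BITS + FLAGS_BITS + LOCK_BITS + LENGTH_SLOT_BITS + DATA_BITS;
    }

    static int emptyArrayBytes() {
        return emptyArrayBits() / 8;
    }

    public static void main(String[] args) {
        System.out.println(emptyArrayBits());  // 256
        System.out.println(emptyArrayBytes()); // 32
    }
}
```

Other JVMs use different header sizes, so these numbers apply only to the layout described here.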

In Java, a String and a char[] are not the same thing. A String will be a separate object containing a reference to a char[].

Berserk answered 8/5, 2015 at 21:22 Comment(5)
What is the memory layout of char[0]?Taphouse
@Taphouse The answer tries to explain that in bit level detail. Is there something that is not clear?Berserk
So basically the char[0] when accessed goes to the 4th word and finds as length 0 which means that there is no data present i.e. there is no valid memory address in the 5th word and on. How does that help instead of just having null then? And also it is different in C++ right?Taphouse
null isn't an object. null is completely different from an empty array.Marine
@Taphouse I don't understand. Do you mean "Why should you use empty arrays instead of null?" There are great reasons why, described at length elsewhere, but none of them are about internal memory representation.Berserk
M
5

Two different objects created with new must be distinct with regard to reference equality, so no, they're not the same object.

Separately, any two Java String references to the constant string "" will refer to the same object, because compile-time constant strings get interned.
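The same rule holds for the arrays themselves. The sketch below, with the made-up class name DistinctEmptyArrays and helper sameReference, shows two zero-length arrays that are equal in content but are still separate objects.

```java
import java.util.Arrays;

// Sketch: two zero-length arrays created with new are distinct objects.
class DistinctEmptyArrays {

    // Hypothetical helper: true only when both references denote the same array object.
    static boolean sameReference(char[] x, char[] y) {
        return x == y;
    }

    public static void main(String[] args) {
        char[] first = new char[0];
        char[] second = new char[0];

        System.out.println(sameReference(first, second)); // false: two separate allocations
        System.out.println(sameReference(first, first));  // true: same object
        System.out.println(Arrays.equals(first, second)); // true: same length, no differing elements
    }
}
```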

Marine answered 8/5, 2015 at 21:16 Comment(7)
So char[0] is a valid memory address? Because I thought that this is some kind of "syntactic" sugar for the compilerTaphouse
No, new char[0] creates a new, perfectly valid empty array object.Marine
What do you mean by empty array object? This is an array with 0 length. What is the representation of this "perfectly valid empty array object"?Taphouse
Yes, it's an array with 0 length. On 32-bit HotSpot, arrays are represented by the normal 8-byte header for all objects, 4 bytes for the length field, and then the contents of the array, and a zero-length array works just the same as any other array.Marine
The same way it's always done. It's rounded up to a multiple of the padding size. There is nothing special about zero-length arrays.Marine
A li'l bit of nerd, but new String("") will not refer to ""Melanoid
@DmitryGinzburg, that's why I specified references to the constant string "", not just empty strings generally.Marine
T
2

Every array object has a length property. When you write

char a[] = new char[0];

the length property gets the value 0, which is the array's size. The length field is 4 bytes, and the array has a normal object header, which is usually 8 bytes on a 32-bit JVM.

There is nothing special about an empty array: it is just like any other array, except that it contains no elements.

It's worth mentioning that an empty array and an array reference that is set to null are two different things. For example, it is often easier to return an empty array from a method instead of null, because the caller can then use the result without a null check.
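As a minimal sketch of that last point, the hypothetical method vowelsIn below returns an empty array when it finds nothing, so the caller can loop over the result directly. The class name, method names, and the shared NO_CHARS constant are all made up for illustration.

```java
// Sketch: returning an empty array instead of null keeps callers free of null checks.
class EmptyInsteadOfNull {

    // A zero-length array has no elements to modify, so one instance can be shared.
    private static final char[] NO_CHARS = new char[0];

    // Hypothetical helper: collects the vowels of text, or returns an empty array if there are none.
    static char[] vowelsIn(String text) {
        StringBuilder found = new StringBuilder();
        for (char c : text.toCharArray()) {
            if ("aeiouAEIOU".indexOf(c) >= 0) {
                found.append(c);
            }
        }
        return found.length() == 0 ? NO_CHARS : found.toString().toCharArray();
    }

    // The caller iterates over the result directly; an empty array just means zero iterations.
    static int countVowels(String text) {
        int count = 0;
        for (char ignored : vowelsIn(text)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countVowels("array"));  // 2
        System.out.println(countVowels("rhythm")); // 0
    }
}
```

Had vowelsIn returned null for "rhythm", the for-each loop in countVowels would have thrown a NullPointerException.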

Tusker answered 8/5, 2015 at 21:23 Comment(0)
