Java, HashMaps and using Strings as the keys - does the string value get stored twice?
T

3

4

If I have a HashMap that looks like this:

HashMap<String, MyObject>

where the String key is a field in MyObject, does this string value get stored twice?

So when I add entries:

_myMap.put(myObj.getName(), myObj);

Am I using double the String size in terms of memory? Or does Java do something clever behind the scenes?

Thanks

Tinhorn answered 13/3, 2010 at 9:45 Comment(1)
Note the answer you've marked "accepted" has less than half as many votes as another answer. When this happens, it's probably a good idea to double-check and make sure you've really accepted the right one. - Kistler
V
4

Java stores references, so only the pointer to the string is stored twice, not the string's characters. You don't have to worry if your string is huge: the extra memory is the size of one reference either way.

Vitia answered 13/3, 2010 at 9:52 Comment(1)
Actually, it depends on how getName() is implemented. See my answer. - Intertidal
I
16

Unless you're actually creating a new String value in getName(), you're not duplicating your memory usage.

Here are a few examples to clarify things:

 String s1 = "Some really long string!";
 String s2 = s1;
 assert s1.equals(s2);

Here, s1 == s2; they refer to the same String instance. Your memory usage is 2 reference variables (no big deal), 1 String instance, and 1 backing char[] (the part that takes up memory).


 String s1 = "Some really long string!";
 String s2 = new String(s1);
 assert s1.equals(s2);

Here, s1 != s2; they refer to different String instances. However, since strings are immutable, the constructor knows that they can share the same character array. Your memory usage is 2 reference variables, 2 String instances (still no big deal, because...), and 1 backing char[].
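The first two cases can be checked directly. The sketch below compares identity with == and content with equals(); the class and method names are made up for illustration and are not part of the original answer. Whether two String instances share a backing array is an internal JDK detail that this code does not (and cannot portably) observe.

```java
// Illustrative sketch: IdentityDemo and its method names are hypothetical.
class IdentityDemo {
    // Copying the reference: both variables point at the same String instance.
    static boolean sameInstanceAfterAssignment() {
        String s1 = "Some really long string!";
        String s2 = s1;
        return s1 == s2;
    }

    // new String(s1) allocates a distinct String instance with equal content.
    static boolean distinctButEqualAfterCopy() {
        String s1 = "Some really long string!";
        String s2 = new String(s1);
        return s1 != s2 && s1.equals(s2);
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterAssignment()); // prints "true"
        System.out.println(distinctButEqualAfterCopy());   // prints "true"
    }
}
```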


 String s1 = "Some really long string!";
 String s2 = new String(s1.toCharArray());
 assert s1.equals(s2);

Here, just like before, s1 != s2. This time, however, a different constructor is used, one that takes a char[]. To ensure immutability, toCharArray() must return a defensive copy of the string's internal array (that way, changes to the returned array cannot mutate the String value).

[toCharArray() returns] a newly allocated character array whose length is the length of this string and whose contents are initialized to contain the character sequence represented by this string.

To make matters worse, the constructor must also defensively copy the given array into its own internal backing array, again to ensure immutability. This means that as many as 3 copies of the character array may live in memory at the same time! One of those (the temporary array returned by toCharArray()) will eventually be garbage-collected, so your lasting memory usage is 2 reference variables, 2 String instances, and 2 backing char[]! NOW your memory usage is doubled!
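Both defensive copies described above can be observed from the outside. In this minimal sketch (the class and method names are hypothetical), mutating the array returned by toCharArray(), or the array passed to the String(char[]) constructor, leaves the String unchanged.

```java
// Illustrative sketch: DefensiveCopyDemo and its method names are hypothetical.
class DefensiveCopyDemo {
    // Mutating the array returned by toCharArray() does not affect the String.
    static String mutateReturnedArray() {
        String s = "abc";
        char[] chars = s.toCharArray();
        chars[0] = 'X';
        return s;
    }

    // Mutating the source array after construction does not affect the new String.
    static String mutateSourceArray() {
        char[] chars = {'a', 'b', 'c'};
        String s = new String(chars);
        chars[0] = 'X';
        return s;
    }

    public static void main(String[] args) {
        System.out.println(mutateReturnedArray()); // prints "abc"
        System.out.println(mutateSourceArray());   // prints "abc"
    }
}
```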


So going back to your question: as long as you're not creating a new String value in getName() (i.e. you simply return this.name;), you're fine. If you are doing even a simple concatenation, however (e.g. return this.firstName + this.lastName;), then each call builds a new String, and you will double your memory usage!

The following code illustrates my point:

public class StringTest {
    final String name;

    StringTest(String name) {
        this.name = name;
    }

    String getName() {
        return this.name;      // this one is fine: returns the existing instance
    //  return this.name + ""; // this one causes OutOfMemoryError: builds a new String per call
    }

    public static void main(String[] args) {
        int N = 10000000;
        // One String of N characters, shared by every array slot below.
        String longString = new String(new char[N]);
        StringTest test = new StringTest(longString);
        String[] arr = new String[N];
        for (int i = 0; i < N; i++) {
            arr[i] = test.getName(); // keeps every returned String reachable
        }
    }
}

You should first verify that the above code runs (java -Xmx128m StringTest) without throwing any exception. Then, modify getName() to return this.name + ""; and run it again. This time you will get an OutOfMemoryError.
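Tying this back to the HashMap in the question, here is a minimal sketch under stated assumptions: MyObject is a hypothetical stand-in for the asker's class, and all other names are made up for illustration. It checks that the key held by the map is the very same String instance as the object's field when getName() returns the field, and that runtime concatenation yields a fresh instance on every evaluation.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: MapKeyDemo and MyObject are hypothetical names.
class MapKeyDemo {
    // Stand-in for the MyObject class from the question.
    static final class MyObject {
        private final String name;
        MyObject(String name) { this.name = name; }
        String getName() { return name; } // returns the field, no new String
    }

    // True if the map's key is the same instance as the object's name field.
    static boolean keyIsSameInstanceAsField() {
        MyObject myObj = new MyObject(new String("some name"));
        Map<String, MyObject> map = new HashMap<>();
        map.put(myObj.getName(), myObj);
        String storedKey = map.keySet().iterator().next();
        return storedKey == myObj.getName();
    }

    // Concatenating non-constant values creates a new String each time.
    static boolean concatenationCreatesNewInstance(String first, String last) {
        String a = first + last;
        String b = first + last;
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(keyIsSameInstanceAsField());                       // prints "true"
        System.out.println(concatenationCreatesNewInstance("Ada", "Lovelace")); // prints "true"
    }
}
```

In the first case the map adds only one more reference to the existing String; in the second, a getter built on concatenation would hand the map a separate String from the ones held in the object's fields.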

Intertidal answered 13/3, 2010 at 11:13 Comment(4)
Could you add references for the italicized parts? - Cran
@JRL: I added reference to toCharArray() defensive copying. - Intertidal
Unsure whether to +1 for a truly comprehensive answer, or -1 for TL;DR. Went with the +1 option :) - Cohesive
Well, I did put the central theme of my answer in the first line =) - Intertidal
M
1

Strings are immutable, but pass-by-reference still applies, so it won't take twice as much memory.

Megaera answered 13/3, 2010 at 9:53 Comment(1)
There is no pass by reference in Java. References are passed by value, along with everything else. - Kylynn
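The distinction in the comment above can be seen in a small sketch (class and method names are hypothetical): a method receives a copy of the reference, so reassigning its parameter has no effect on the caller's variable, while both copies still point at the same String and no characters are duplicated.

```java
// Illustrative sketch: PassByValueDemo and its method names are hypothetical.
class PassByValueDemo {
    // Reassigning the parameter only changes this method's local copy of the reference.
    static void reassign(String s) {
        s = "changed";
    }

    // The caller's variable still refers to the original String afterwards.
    static String callerSees() {
        String original = "original";
        reassign(original);
        return original;
    }

    public static void main(String[] args) {
        System.out.println(callerSees()); // prints "original"
    }
}
```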
