Note: This answer uses examples that might not be relevant in modern runtime JVM libraries. In particular, the substring
example is no longer an issue in OpenJDK/Oracle 7+.
I know it goes against what people often tell you, but sometimes explicitly creating new String
instances can be a significant way to reduce your memory.
Because Strings are immutable, several methods leverage that fact and share the backing character array to save memory. However, occasionally this can actually increase the memory by preventing garbage collection of unused parts of those arrays.
For example, assume you were parsing the message IDs of a log file to extract warning IDs. Your code would look something like this:
//Format:
//ID: [WARNING|ERROR|DEBUG] Message...
String testLine = "5AB729: WARNING Some really really really long message";
Matcher matcher = Pattern.compile("([A-Z0-9]*): WARNING.*").matcher(testLine);
if ( matcher.matches() ) {
String id = matcher.group(1);
//...do something with id...
}
But look at the data actually being stored:
//...
String id = matcher.group(1);
Field valueField = String.class.getDeclaredField("value");
valueField.setAccessible(true);
char[] data = ((char[])valueField.get(id));
System.out.println("Actual data stored for string \"" + id + "\": " + Arrays.toString(data) );
It's the whole test line, because the matcher just wraps a new String instance around the same character data. Compare the results when you replace String id = matcher.group(1);
with String id = new String(matcher.group(1));
.