An immutable object is one whose internal fields (or at least, all the internal fields that affect its external behavior) cannot be changed after it is constructed.
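As a rough illustration of that definition, here is a minimal Java sketch. The `Point` class and its `withX` method are hypothetical names made up for this example; the idea is simply that every field is `final` and any "change" produces a new object.

```java
// Hypothetical example class: all fields are final, so the state
// is fixed once the constructor has finished.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Changing" a point returns a new object; the original is untouched.
    Point withX(int newX) {
        return new Point(newX, y);
    }
}
```

Calling `new Point(1, 2).withX(5)` yields a second point with x equal to 5, while the first point still reports x equal to 1.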
There are a lot of advantages to immutable strings:
Performance: Take the following operation:
String substring = fullstring.substring(x,y);
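The underlying C for the substring() method could be something like this (older JDKs did share the character array this way; since Java 7u6, substring() copies the characters instead, but the sketch still shows what immutability makes possible):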
// Assume string is stored like this:
struct String { char* characters; unsigned int length; };
// Passing pointers because Java passes object references (by value), never whole objects
struct String* substring(struct String* in, unsigned int begin, unsigned int end)
{
    struct String* out = malloc(sizeof(struct String));
    out->characters = in->characters + begin;
    out->length = end - begin;
    return out;
}
Note that none of the characters have to be copied! If the String object were mutable (the characters could change later) then you would have to copy all the characters, otherwise changes to characters in the substring would be reflected in the other string later.
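To make the aliasing problem concrete, here is a small Java sketch of what goes wrong when a string type is mutable and still shares its characters. `MutableSlice` is a hypothetical class invented for this illustration, not a real JDK type.

```java
// Hypothetical mutable string type that shares its backing array.
final class MutableSlice {
    private final char[] chars; // backing array, shared between slices
    private final int offset;
    private final int length;

    private MutableSlice(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    static MutableSlice of(String s) {
        return new MutableSlice(s.toCharArray(), 0, s.length());
    }

    // Zero-copy substring: same array, different window.
    MutableSlice substring(int begin, int end) {
        return new MutableSlice(chars, offset + begin, end - begin);
    }

    // The mutation that makes sharing unsafe.
    void setCharAt(int index, char c) {
        chars[offset + index] = c;
    }

    @Override
    public String toString() {
        return new String(chars, offset, length);
    }

    public static void main(String[] args) {
        MutableSlice full = MutableSlice.of("hello world");
        MutableSlice sub = full.substring(0, 5);
        sub.setCharAt(0, 'J');
        System.out.println(sub);  // prints "Jello"
        System.out.println(full); // prints "Jello world": the edit leaked into the original
    }
}
```

With an immutable type there is no `setCharAt`, so the shared array can never be observed changing and the zero-copy substring is safe.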
Concurrency: If the internal structure of an immutable object is valid, it will always be valid. There's no chance that different threads can create an invalid state within that object. Hence, immutable objects are thread-safe.
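Garbage collection: It's much easier for the garbage collector to make logical decisions about immutable objects. For example, an immutable object can only reference objects that already existed when it was constructed, so it never acquires a pointer to a younger object, which is the case generational collectors have to track specially.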
However, there are also downsides to immutability:
Performance: Wait, I thought you said performance was an upside of immutability! Well, it is sometimes, but not always. Take the following code:
foo = foo.substring(0,4) + "a" + foo.substring(5); // foo is a String
bar.replace(4,5,"a"); // bar is a StringBuilder
The two lines both replace the character at index 4 (the fifth character) with the letter "a". Not only is the second line more readable, it's also faster. Look at what the underlying code for foo has to do. The substrings are cheap, but because there is already a character at index 4 and something else might be referencing foo's characters, you can't just overwrite it; you have to copy the whole string. (Of course, some of this is split into separate functions in the real underlying C, but the point here is to show all the code that gets executed in one place.)
struct String* concatenate(struct String* first, struct String* second)
{
    struct String* out = malloc(sizeof(struct String));
    out->length = first->length + second->length;
    out->characters = malloc(out->length);
    unsigned int i; // unsigned to match the length fields
    // Copy every character of the first string...
    for(i = 0; i < first->length; i++)
        out->characters[i] = first->characters[i];
    // ...then every character of the second
    for(; i - first->length < second->length; i++)
        out->characters[i] = second->characters[i - first->length];
    return out;
}
// The code that executes
char a = 'a';
struct String astring = { &a, 1 }; // a one-character string holding "a"
foo = concatenate(concatenate(substring(foo, 0, 4), &astring), substring(foo, 5, foo->length));
Note that concatenate gets called twice, so every character of the result is copied at least once (and the first five are copied twice)! Compare this to the C code for the bar operation:
bar->characters[4] = 'a';
The mutable operation is a single write that takes constant time, whereas the immutable version takes time proportional to the length of the string.
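Back in Java, the two approaches can be put side by side. This is only a sketch: `ReplaceDemo` and its method names are made up for the example, and `setCharAt` is used as the single-character equivalent of the `replace(4, 5, "a")` call on bar.

```java
final class ReplaceDemo {
    // Immutable route: builds a brand-new String out of three pieces.
    static String replaceImmutable(String foo) {
        return foo.substring(0, 4) + "a" + foo.substring(5);
    }

    // Mutable route: overwrites one character in place.
    static String replaceMutable(String text) {
        StringBuilder bar = new StringBuilder(text);
        bar.setCharAt(4, 'a'); // same effect as bar.replace(4, 5, "a")
        return bar.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceImmutable("0123456789")); // prints "0123a56789"
        System.out.println(replaceMutable("0123456789"));   // prints "0123a56789"
    }
}
```

Note that creating the StringBuilder and calling toString() each copy the characters too, so the mutable route only pays off when many edits are made to the same builder before converting back.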
In conclusion: In most cases, you want an immutable string. But if you need to do a lot of appending and inserting, you need mutability for speed. If you also want the concurrency-safety and garbage-collection benefits, the key is to keep your mutable objects local to a method:
// This will have awful performance if you don't use mutable strings
String join(String[] strings, String separator)
{
    StringBuilder mutable = new StringBuilder(); // must be initialized before use
    boolean first = true;
    for(int i = 0; i < strings.length; i++)
    {
        if(first) first = false;
        else mutable.append(separator);
        mutable.append(strings[i]);
    }
    return mutable.toString();
}
Since the mutable object is only referenced by a local variable, you don't have to worry about concurrency safety (only one thread ever touches it). And since it isn't referenced anywhere else, it becomes unreachable as soon as the method returns, which makes it very cheap for the garbage collector to reclaim. (Java objects normally live on the heap rather than the stack, although the JIT compiler's escape analysis may avoid the heap allocation for an object that never leaves its method.) You get the performance benefits of both mutability and immutability.
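Here is a compact, self-contained variant of the join method above with a usage example. The class name `JoinDemo` is an assumption for illustration. The second half of main shows why handing out the result is safe: toString() produces an independent immutable String, so later changes to a builder never show through.

```java
final class JoinDemo {
    static String join(String[] strings, String separator) {
        StringBuilder local = new StringBuilder(); // never escapes this method
        for (int i = 0; i < strings.length; i++) {
            if (i > 0) local.append(separator);
            local.append(strings[i]);
        }
        return local.toString(); // callers only ever see the immutable result
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(join(parts, ", "));        // prints "a, b, c"
        System.out.println(String.join(", ", parts)); // standard-library equivalent (Java 8+)

        StringBuilder sb = new StringBuilder("abc");
        String snapshot = sb.toString();
        sb.append("def");
        System.out.println(snapshot); // prints "abc": unaffected by the later append
    }
}
```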