You would be amazed how much effort was put into String concatenation in jdk-9. First, javac emits an invokedynamic instead of a chain of calls to StringBuilder#append. That invokedynamic returns a CallSite which contains a MethodHandle (that is actually a series of MethodHandles).
Thus the decision of what is actually done for a String concatenation is moved to the runtime. The downside is that the first concatenation at a given call site (that is, for a given combination of argument types) is going to be slower, because the call site has to be linked first.
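To make the CallSite part less abstract, here is a minimal sketch that calls the bootstrap method by hand, roughly the way the JVM does when it links the invokedynamic. The class name, method name and recipe are made up for illustration; javac generates its own recipe, and real code never needs to do this manually.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class: links a concatenation call site manually (requires Java 9+).
class IndyConcatDemo {

    static String concat(String name, int count) {
        try {
            // In the recipe, \u0001 marks the position of a dynamic argument.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants", // the name is arbitrary for this bootstrap method
                    MethodType.methodType(String.class, String.class, int.class),
                    "\u0001 has \u0001 items");
            MethodHandle target = site.getTarget();
            // The handle's type is exactly (String, int) -> String.
            return (String) target.invokeExact(name, count);
        } catch (Throwable t) {
            throw new IllegalStateException("linking or invoking the call site failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("cart", 3));
    }
}
```

The linking step is the slow part mentioned above; once the CallSite exists, invoking its target is just a method handle call.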
Then there is a series of strategies to choose from when concatenating a String (you can override the default one via the java.lang.invoke.stringConcat system property):
    private enum Strategy {
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder}.
         */
        BC_SB,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but trying to estimate the required storage.
         */
        BC_SB_SIZED,

        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but computing the required storage exactly.
         */
        BC_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also tries to estimate the required storage.
         */
        MH_SB_SIZED,

        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also estimate the required storage exactly.
         */
        MH_SB_SIZED_EXACT,

        /**
         * MethodHandle-based generator, that constructs its own byte[] array from
         * the arguments. It computes the required storage exactly.
         */
        MH_INLINE_SIZED_EXACT
    }
The default strategy is MH_INLINE_SIZED_EXACT, which is a beast!
It uses the package-private constructor to build the String (which is the fastest):
    /*
     * Package private constructor which shares value array for speed.
     */
    String(byte[] value, byte coder) {
        this.value = value;
        this.coder = coder;
    }
First, this strategy creates so-called filters; these are basically method handles that transform an incoming parameter into a String value. As one might expect, these MethodHandles are stored in a class called Stringifiers, and in most cases they produce a MethodHandle that calls String.valueOf(YourInstance).
So if you have 3 Objects that you want to concatenate, there will be 3 MethodHandles that delegate to String.valueOf(YourObject), which effectively means that you have transformed your objects into Strings.
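The filter idea can be sketched with the public MethodHandles API. This is not the JDK's Stringifiers code, just an illustration under my own assumptions: the class and method names are hypothetical, and String#concat stands in for the real concatenation target.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch: String.valueOf(Object) used as an argument filter.
class StringifierSketch {

    static String concatAny(Object left, Object right) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.publicLookup();
            // (Object) -> String: the "stringifier"
            MethodHandle valueOf = lookup.findStatic(
                    String.class, "valueOf",
                    MethodType.methodType(String.class, Object.class));
            // (String, String) -> String: stand-in for the real concatenation
            MethodHandle concat = lookup.findVirtual(
                    String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            // Filter both arguments, giving (Object, Object) -> String
            MethodHandle filtered = MethodHandles.filterArguments(concat, 0, valueOf, valueOf);
            return (String) filtered.invokeExact(left, right);
        } catch (Throwable t) {
            throw new IllegalStateException("method handle lookup or invocation failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatAny(42, null));
    }
}
```

Each argument goes through String.valueOf before the target sees it, so the target only ever has to deal with Strings.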
There are certain tweaks inside this class that I still can't understand, like the need to have the separate classes StringifierMost (which converts only references, floats and doubles to String) and StringifierAny.
Since MH_INLINE_SIZED_EXACT promises that the byte array is allocated at exactly the right size, there has to be a way to compute that size.
This is done via the StringConcatHelper#mixLen methods, which take the stringified versions of your input parameters (references/float/double). At this point we know the size of our final String. Well, we don't actually know it yet; we have a MethodHandle that will compute it.
There's one more change to String in jdk-9 that is worth mentioning here: the addition of a coder field, which records whether the backing byte[] is encoded as LATIN1 (one byte per char) or UTF16 (two bytes per char). It is needed to compute the size/equality/charAt of a String. Since it's needed for the size, we have to compute it as well; this is done via StringConcatHelper#mixCoder.
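Here is a simplified sketch of what the length and coder mixing amounts to. The method names mirror the StringConcatHelper ones, but the bodies are my own approximation written against the public String API (the real code reads the private coder field instead of scanning chars), so treat the whole class as hypothetical.

```java
// Hypothetical sketch of the size/coder computation; not the JDK's implementation.
class SizeAndCoderSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Approximation of a String's coder: LATIN1 if every char fits in one byte.
    static byte coderOf(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Adds one argument's length to the running total, failing on int overflow.
    static int mixLen(int current, String value) {
        return Math.addExact(current, value.length());
    }

    // A single UTF16 argument forces the whole result to UTF16.
    static byte mixCoder(byte current, String value) {
        return (byte) (current | coderOf(value));
    }

    // Exact number of bytes the result array needs.
    static int byteSize(String... parts) {
        int length = 0;
        byte coder = LATIN1;
        for (String part : parts) {
            length = mixLen(length, part);
            coder = mixCoder(coder, part);
        }
        return length << coder;
    }

    public static void main(String[] args) {
        System.out.println(byteSize("abc", "de"));
        System.out.println(byteSize("abc", "\u20AC"));
    }
}
```

The shift at the end is the same trick as the length << coder expression in the JDK's newArray: coder 0 leaves the length untouched, coder 1 doubles it.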
It is safe at this point to delegate to a MethodHandle that will create our array:
    @ForceInline
    private static byte[] newArray(int length, byte coder) {
        return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length << coder);
    }
How is each element appended? Via the methods in StringConcatHelper#prepend.
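As the name suggests, the array is filled from the end towards the start, with an index that moves down as each argument is written. Below is a rough sketch of that idea in plain Java. It is not the JDK code: the class is hypothetical, it uses an ordinary zeroed array instead of Unsafe, it has to go through a public String constructor (which copies) rather than the package-private one, and it assumes the caller passes LATIN1 only when every char fits in one byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of back-to-front filling; not the JDK's implementation.
class PrependSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Writes value so that it ends just before index; returns the new, lower index.
    static int prepend(int index, byte[] buf, byte coder, String value) {
        index -= value.length();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (coder == LATIN1) {
                buf[index + i] = (byte) c;
            } else {
                int pos = (index + i) << 1;  // two bytes per char
                buf[pos] = (byte) (c >> 8);  // high byte first (big-endian, chosen for this sketch)
                buf[pos + 1] = (byte) c;
            }
        }
        return index;
    }

    static String concat(byte coder, String... parts) {
        int length = 0;
        for (String part : parts) {
            length = Math.addExact(length, part.length());
        }
        byte[] buf = new byte[length << coder];  // exact size, never resized
        int index = length;
        for (int i = parts.length - 1; i >= 0; i--) {
            index = prepend(index, buf, coder, parts[i]);  // last argument goes in first
        }
        return new String(buf, coder == LATIN1
                ? StandardCharsets.ISO_8859_1
                : StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(concat(LATIN1, "foo", "bar", "!"));
    }
}
```

When the last prepend returns, the index is back at 0 and every slot of the array has been written exactly once, which is why an uninitialized array is safe in the real implementation.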
And only now do we have all the details needed to invoke that String constructor that takes a byte[] and a coder.
All these operations (and many others I have skipped for simplicity) are handled via emitting a MethodHandle that will be invoked when the appending actually happens.
... StringBuilder to repeatedly reallocate. But that was specific to Oracle's JDK, looking at the resulting bytecode, and so didn't account for any optimization the JVM might do. My rule was: 99.999% of the time you don't care, of course; for the .001% where you care, use an explicit StringBuilder allocated big enough to handle the total result. – Financier
... StringBuilder manually (strictly as a local variable). But if the code is just a single line, concatenating 2-3-4 values, I wouldn't bother. – Inelegance