How can I improve performance if append() is called on StringBuffer (or StringBuilder) consecutively without reusing the target variable?
I have the following piece of code in Java.

String foo = " ";

Method 1:

StringBuffer buf = new StringBuffer();
buf.append("Hello");
buf.append(foo);
buf.append("World");  

Method 2:

StringBuffer buf = new StringBuffer();
buf.append("Hello").append(foo).append("World");

Can someone enlighten me as to how Method 2 can improve the performance of the code?

https://pmd.github.io/pmd-5.4.2/pmd-java/rules/java/strings.html#ConsecutiveAppendsShouldReuse

Dey answered 7/6, 2016 at 7:14 Comment(6)
Try decompiling the two styles with javap to see the difference. Whilst there is a difference in the bytecode, this feels like unnecessary micro-optimization, which the JIT would likely deal with for you. – Hutner
@AndyTurner the difference will be optimized away by HotSpot; you won't see any performance difference at runtime. – Mikkimiko
Always so worried about performance, these new programmers. – Moskow
What makes you think that one is more efficient than the other? What makes you think the difference is large enough to be important? – Yuki
Dogma of the day: "Documentation is always correct." – Manifestation
@Yuki I am using PMD 5.4.2, so it shows me rule violations for using the 2nd method. That's my only concern. – Dey
Is it really different?

Let's start by analyzing javac output. Given the code:

public class Main {
  public String appendInline() {
    final StringBuilder sb = new StringBuilder().append("some").append(' ').append("string");
    return sb.toString();
  }

  public String appendPerLine() {
    final StringBuilder sb = new StringBuilder();
    sb.append("some");
    sb.append(' ');
    sb.append("string");
    return sb.toString();
  }
}

We compile with javac, and check the output with javap -c -s:

  public java.lang.String appendInline();
    descriptor: ()Ljava/lang/String;
    Code:
       0: new           #2                  // class java/lang/StringBuilder
       3: dup
       4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
       7: ldc           #4                  // String some
       9: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      12: bipush        32
      14: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
      17: ldc           #7                  // String string
      19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      22: astore_1
      23: aload_1
      24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      27: areturn

  public java.lang.String appendPerLine();
    descriptor: ()Ljava/lang/String;
    Code:
       0: new           #2                  // class java/lang/StringBuilder
       3: dup
       4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
       7: astore_1
       8: aload_1
       9: ldc           #4                  // String some
      11: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      14: pop
      15: aload_1
      16: bipush        32
      18: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
      21: pop
      22: aload_1
      23: ldc           #7                  // String string
      25: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      28: pop
      29: aload_1
      30: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      33: areturn

As seen, the appendPerLine variant produces much larger bytecode: several extra aload_1 and pop instructions that basically cancel each other out (pushing the string builder / buffer onto the stack, then popping it to discard it). In turn, this means the JRE produces a larger callsite with greater overhead. Conversely, a smaller callsite improves the chances that the JVM will inline the method calls, reducing method-call overhead and further improving performance.

This alone improves the performance from a cold start when chaining method calls.
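Incidentally, the chained form is also what javac itself emits for plain '+' string concatenation, so Method 2 simply mirrors the compiler's own output. A minimal sketch you can decompile with javap -c to confirm (the class and method names here are mine, not from the code above; the claim holds for Java 8, which the benchmark below uses):

```java
// Hypothetical demo class for decompiling with javap -c.
public class ConcatDemo {
    // With a non-constant operand, Java 8 javac compiles this '+'
    // concatenation into a chained StringBuilder sequence, e.g.
    //   new StringBuilder().append(...).append(...).toString()
    // (constant parts may be folded first; Java 9+ emits an
    // invokedynamic call to StringConcatFactory instead).
    static String concat(String who) {
        return "Hello" + ' ' + who;
    }

    public static void main(String[] args) {
        System.out.println(concat("World")); // prints "Hello World"
    }
}
```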

Shouldn't the JVM optimize this away?

One could argue that the JRE should be able to optimize these instructions away once the VM has warmed up. However, this claim needs support, and would still only apply to long-running processes.

So, let's check this claim, and validate the performance even after warmup. Let's use JMH to benchmark this behavior:

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class StringBenchmark {
    private String from = "Alex";
    private String to = "Readers";
    private String subject = "Benchmarking with JMH";

    @Param({"16"})
    private int size;

    @Benchmark
    public String testEmailBuilderSimple() {
        StringBuilder builder = new StringBuilder(size);
        builder.append("From");
        builder.append(from);
        builder.append("To");
        builder.append(to);
        builder.append("Subject");
        builder.append(subject);
        return builder.toString();
    }

    @Benchmark
    public String testEmailBufferSimple() {
        StringBuffer buffer = new StringBuffer(size);
        buffer.append("From");
        buffer.append(from);
        buffer.append("To");
        buffer.append(to);
        buffer.append("Subject");
        buffer.append(subject);
        return buffer.toString();
    }

    @Benchmark
    public String testEmailBuilderChain() {
        return new StringBuilder(size).append("From").append(from).append("To").append(to).append("Subject")
                .append(subject).toString();
    }

    @Benchmark
    public String testEmailBufferChain() {
        return new StringBuffer(size).append("From").append(from).append("To").append(to).append("Subject")
                .append(subject).toString();
    }
}

We compile and run it, and obtain:

Benchmark                               (size)   Mode  Cnt         Score        Error  Units
StringBenchmark.testEmailBufferChain        16  thrpt  200  22981842.957 ± 238502.907  ops/s
StringBenchmark.testEmailBufferSimple       16  thrpt  200   5789967.103 ±  62743.660  ops/s
StringBenchmark.testEmailBuilderChain       16  thrpt  200  22984472.260 ± 212243.175  ops/s
StringBenchmark.testEmailBuilderSimple      16  thrpt  200   5778824.788 ±  59200.312  ops/s

So, even after warming up, following the rule produces a ~4X improvement in throughput. All these runs were done using Oracle JRE 8u121.

Of course, you don't have to believe me: others have done similar analyses, and you can even try it yourself.

Does it even matter?

Well, it depends. This is certainly a micro-optimization. If a system is using bubble sort, there are far more pressing performance issues than this one. Not all programs have the same requirements, and therefore not all need to follow the same rules.

This PMD rule is probably meaningful only to specific projects that value performance greatly and will do whatever it takes to shave off a couple of milliseconds. Such projects would normally use several different profilers, microbenchmarks, and other tools, and having a tool such as PMD keep an eye on specific patterns will certainly help them.

PMD has many other rules available that will probably apply to many other projects. Just because this particular rule may not apply to your project doesn't mean the tool isn't useful; take your time to review the available rules and choose those that really matter to you.
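For completeness, if you want to keep only the rules that matter to your project, the usual route is a custom ruleset file. A sketch for PMD 5.x, matching the version mentioned in the question (the ruleset name and description below are illustrative, and in PMD 6+ this rule moved to category/java/performance.xml):

```xml
<?xml version="1.0"?>
<ruleset name="my-string-rules"
         xmlns="http://pmd.sourceforge.net/ruleset/2.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://pmd.sourceforge.net/ruleset/2.0.0 http://pmd.sourceforge.net/ruleset_2_0_0.xsd">
  <description>Only the string rules this project cares about (illustrative)</description>
  <!-- PMD 5.x reference path for the rule discussed above -->
  <rule ref="rulesets/java/strings.xml/ConsecutiveAppendsShouldReuse"/>
</ruleset>
```

Pointing PMD at this file instead of the full default rulesets silences the rules you have decided not to follow.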

Hope that clears it up for everyone.

Anfractuous answered 24/1, 2017 at 21:22 Comment(0)
