Fastest way to concatenate ReadOnlySpan<char> in C#
Asked Answered
G

1

5

What's the most efficient way to concatenate strings, if I already only have ReadOnlySpan slices?

Simplified example:

public class Program {
    public string ConcatSpans(string longstring) {
        var span = longstring.AsSpan();
        var sb = new StringBuilder(longstring.Length);
        sb.Append(span.Slice(40, 10));
        sb.Append(span.Slice(30, 10));
        sb.Append(span.Slice(20, 10));
        sb.Append(span.Slice(10, 10));
        sb.Append(span.Slice(0, 10));
        return sb.ToString();
    }

    [Benchmark]
    public void ConcatSpansBenchmark() {
        ConcatSpans("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
    }

    public static void Main(string[] args) {
        var summary = BenchmarkRunner.Run<Program>();
    }
}

Results:

BenchmarkDotNet=v0.11.2, OS=Windows 10.0.17134.345 (1803/April2018Update/Redstone4)
Intel Core i5-2500K CPU 3.30GHz (Sandy Bridge), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=2.1.403
  [Host]     : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT


               Method |     Mean |    Error |   StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
--------------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:|
 ConcatSpansBenchmark | 126.6 ns | 1.712 ns | 1.601 ns |      0.0966 |           - |           - |               304 B |

Is StringBuilder really the best we can do? Is there a way to go faster than that? With even less allocations? After all StringBuilder object itself is a heap object.

If there was a ref struct StringBuilder that would only keep references to ReadOnlySpans and in the final ToString just allocate one string object?

Guttering answered 6/11, 2018 at 21:33 Comment(2)
Well, I don't have hard requirements. This is more of an experiment. Trying to find the "easy, yet efficient" sweet spot :). Of course allocating a buffer, copy slices over there sounds like the most efficient way, but would already become harder if length is not known (I know it is in my example, but just assume).Guttering
btw: thanks for starting a perf question by using BenchmarkDotNet - makes it much easier to work with!Sacksen
S
12

Edit: as Tseng notes in the comments, the newer string.Create method is the way to do this on platforms where it exists.


The scenario with multiple (but known) input spans is ideal for a "preallocate a dummy string, then pretend that strings are mutable and overwrite it before exposing it to the world" scenario. This looks gnarly, but this trick is very common in IO code when dealing with strings (especially from discontiguous buffers, etc), so it is well understood and supported.

Here we go (edit: now with added "hybrid" method, which avoids all the Slice() calls, without needing unsafe):

                        Method |     Mean |     Error |    StdDev |   Median |
------------------------------ |---------:|----------:|----------:|---------:|
          ConcatSpansBenchmark | 97.17 ns | 2.1335 ns | 4.0072 ns | 97.20 ns |
       OverwiteStringBenchmark | 63.34 ns | 1.2914 ns | 2.0854 ns | 62.29 ns |
      UnsafeOverwriteBenchmark | 17.95 ns | 0.3697 ns | 0.3796 ns | 17.80 ns |
 OverwiteStringHybridBenchmark | 53.59 ns | 0.5534 ns | 0.5176 ns | 53.49 ns |

Note: anything involving MemoryMarshal.*, Unsafe.* or the unsafe keyword is explicitly "I know what I'm doing... anything that exploded is probably my fault".

Code:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Text;

public class Program
{
    public string ConcatSpans(string longstring)
    {
        var span = longstring.AsSpan();
        var sb = new StringBuilder(longstring.Length);
        sb.Append(span.Slice(40, 10));
        sb.Append(span.Slice(30, 10));
        sb.Append(span.Slice(20, 10));
        sb.Append(span.Slice(10, 10));
        sb.Append(span.Slice(0, 10));
        return sb.ToString();
    }

    public string OverwiteString(string longstring)
    {
        var span = longstring.AsSpan();
        var s = new string('\0', longstring.Length);
        var writeable = MemoryMarshal.AsMemory(s.AsMemory()).Span;
        span.Slice(40, 10).CopyTo(writeable);
        writeable = writeable.Slice(10);
        span.Slice(30, 10).CopyTo(writeable);
        writeable = writeable.Slice(10);
        span.Slice(20, 10).CopyTo(writeable);
        writeable = writeable.Slice(10);
        span.Slice(10, 10).CopyTo(writeable);
        writeable = writeable.Slice(10);
        span.Slice(0, 10).CopyTo(writeable);
        return s;
    }

    public string OverwiteStringHybrid(string longstring)
    {
        var source = MemoryMarshal.AsBytes(MemoryMarshal.AsMemory(longstring.AsMemory()).Span);
        var s = new string('\0', longstring.Length);
        var target = MemoryMarshal.AsBytes(MemoryMarshal.AsMemory(s.AsMemory()).Span);

        Unsafe.CopyBlock(ref target[0], ref source[40 * sizeof(char)], 10 * sizeof(char));
        Unsafe.CopyBlock(ref target[10], ref source[30 * sizeof(char)], 10 * sizeof(char));
        Unsafe.CopyBlock(ref target[20], ref source[20 * sizeof(char)], 10 * sizeof(char));
        Unsafe.CopyBlock(ref target[30], ref source[10 * sizeof(char)], 10 * sizeof(char));
        Unsafe.CopyBlock(ref target[40], ref source[0], 10 * sizeof(char));

        return s;
    }

    public unsafe string UnsafeOverwrite(string longstring)
    {
        var s = new string('\0', longstring.Length);
        fixed (char* source = longstring)
        fixed (char* target = s)
        {
            Unsafe.CopyBlock(target, source + 40, 10 * sizeof(char));
            Unsafe.CopyBlock(target + 10, source + 30, 10 * sizeof(char));
            Unsafe.CopyBlock(target + 20, source + 20, 10 * sizeof(char));
            Unsafe.CopyBlock(target + 30, source + 10, 10 * sizeof(char));
            Unsafe.CopyBlock(target + 40, source, 10 * sizeof(char));
        }
        return s;
    }

    [Benchmark]
    public void ConcatSpansBenchmark()
        => ConcatSpans("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
    [Benchmark]
    public void OverwiteStringBenchmark()
    => OverwiteString("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");
    [Benchmark]
    public void UnsafeOverwriteBenchmark()
    => UnsafeOverwrite("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");

    [Benchmark]
    public void OverwiteStringHybridBenchmark()
    => OverwiteStringHybrid("aaaaaaaaaabbbbbbbbbbccccccccccddddddddddeeeeeeeeee");

    public static void Main(string[] args)
        => BenchmarkRunner.Run<Program>();
}

Note: in the general case - to get unsafe code from a slice:

with C# 7.3:

fixed(char* p = theSpan)
{
    ...
}

otherwise:

fixed(char* p = &MemoryMarshal.GetReference(theSpan))
{

}
Sacksen answered 6/11, 2018 at 22:11 Comment(9)
Great, thank you. I've just turned your advice into a StringBuilder implementation. See github.com/discostu105/stringbuilder-tests/blob/master/…Guttering
@Guttering nice - you should probably add a bounds check to Append, though, so it can't write past the end of the objectSacksen
Of course. I was just going for max perf. Added bound checks, so nobody can shoot themself in the foot when copying it :).Guttering
Avoid the string allocation and you will find true perf. Or just use our ValueStringBuilder implementation: github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/…Kuehnel
indeed, @ViktorHofer - that then gets into a much more complex discussion about pools, leases, arenas, things like IMemoryOwner<T>, but... yeah, the allocation you don't do is the fastest :)Sacksen
Thanks @ViktorHofer! ValueStringBuilder is pretty close to my FastBufferStringBuilder impl (github.com/discostu105/stringbuilder-tests/blob/master/…). Though it does not fully outperform the Unsafe.CopyBlock variation, it's close to it and works without unsafe!Guttering
ValueStringBuilder is actually exactly what I was looking for. Will it be made public any time soon?Guttering
What about string.Create?Febrific
@Febrific yes, I believe that would be preferred now; it wasn't available at the timeSacksen

© 2022 - 2024 — McMap. All rights reserved.