Concatenate ReadOnlySpan<char>
U

8

20

Ok, .NET Core 2.1 has landed, and with it a new way to work with string data: ReadOnlySpan<char>. It's great at splitting string data, but how do you combine the spans back together?

var hello = "Hello".AsSpan();
var space = " ".AsSpan();
var world = "World".AsSpan();

var result = ...; // How do I get "Hello World" out of the 3 above?
Unclinch answered 31/5, 2018 at 23:12 Comment(6)
Spans generally refer to pre-existing memory. To do what you want, you'd essentially need to allocate a new string or char-array, and then overwrite them (yes, you can overwrite string - immutability is a lie) using the source spans. Concatenation isn't readily built in AFAIK.Chilton
@MarcGravell Are there any posts or articles that discuss overwriting a string?Usanis
How about writing your own wrapper that encapsulates list of span and overrides the indexer?Probability
@OnurGumus to get a coherent API I need the result be either ReadOnlySpan<char> or stringUnclinch
@Unclinch The documentation is sparse probably (I think it is undocumented)... But for example you can take a look at string.Copy that modifies directly the internal value of the string using Buffer.Memcpy through wstrcpy. The important thing is that the string you modify must be newly allocated (with the new String('\0', length) constructor for example, or another constructor, or the .Copy()). See for example hereAzov
@Usanis in the context of spans, you can trivially force a read-only span into a read-write span (read-only is also a lie); then you're already done - just mutate away. Prior to spans: unsafe is your friend fixed(char* ptr = theString) { ptr[3] = 'f'; }Chilton
P
13

I think it's worth mentioning that an overload for concatenating spans was added in .NET Core 3 and that support for .NET Core 2.1 ended on August 21, 2021 [src]. If you upgrade now then you can simply use String.Concat.

https://learn.microsoft.com/en-us/dotnet/api/system.string.concat?view=netcore-3.1#System_String_Concat_System_ReadOnlySpan_System_Char__System_ReadOnlySpan_System_Char__System_ReadOnlySpan_System_Char__

var hello = "Hello".AsSpan();
var space = " ".AsSpan();
var world = "World".AsSpan();

// .NET Core 3+
var result = string.Concat(hello, space, world);
Pricilla answered 18/8, 2020 at 12:55 Comment(0)
S
12

Here's an example of how the .NET team internally handles this for Path.Join:

private static unsafe string JoinInternal(ReadOnlySpan<char> first, ReadOnlySpan<char> second)
{
    Debug.Assert(first.Length > 0 && second.Length > 0, "should have dealt with empty paths");

    bool hasSeparator = PathInternal.IsDirectorySeparator(first[first.Length - 1])
        || PathInternal.IsDirectorySeparator(second[0]);

    fixed (char* f = &MemoryMarshal.GetReference(first), s = &MemoryMarshal.GetReference(second))
    {
        return string.Create(
            first.Length + second.Length + (hasSeparator ? 0 : 1),
            (First: (IntPtr)f, FirstLength: first.Length, Second: (IntPtr)s, SecondLength: second.Length, HasSeparator: hasSeparator),
            (destination, state) =>
            {
                new Span<char>((char*)state.First, state.FirstLength).CopyTo(destination);
                if (!state.HasSeparator)
                    destination[state.FirstLength] = PathInternal.DirectorySeparatorChar;
                new Span<char>((char*)state.Second, state.SecondLength).CopyTo(destination.Slice(state.FirstLength + (state.HasSeparator ? 0 : 1)));
            });
    }
}
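
A span cannot be captured by a lambda, which is why the code above smuggles pointers through the state. If the inputs are available as ReadOnlyMemory<char> instead, the same string.Create pattern works without unsafe. The following is only a minimal sketch under that assumption; MemoryConcat and its Concat method are hypothetical names, not part of the framework.

```csharp
using System;

// Hypothetical helper, not part of the BCL.
public static class MemoryConcat
{
    public static string Concat(ReadOnlyMemory<char> first, ReadOnlyMemory<char> second)
    {
        // string.Create allocates the final string once and lets the callback fill it.
        return string.Create(
            checked(first.Length + second.Length),
            (First: first, Second: second),
            (destination, state) =>
            {
                // ReadOnlyMemory<char> can be stored in the state tuple; spans are taken inside the callback.
                state.First.Span.CopyTo(destination);
                state.Second.Span.CopyTo(destination.Slice(state.First.Length));
            });
    }
}
```

For example, MemoryConcat.Concat("Hello ".AsMemory(), "World".AsMemory()) produces the combined string with a single allocation for the result.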

If you'd like to avoid unsafe code and prefer something that's perhaps easier to read (at the cost of extra allocations, and with a using System.Linq directive), you could use something like:

public static ReadOnlySpan<char> Concat(this ReadOnlySpan<char> first, ReadOnlySpan<char> second)
{
    return new string(first.ToArray().Concat(second.ToArray()).ToArray()).AsSpan();
}

public static ReadOnlySpan<char> Concat(this string first, ReadOnlySpan<char> second)
{
    return new string(first.ToArray().Concat(second.ToArray()).ToArray()).AsSpan();
}

ReadOnlySpan is a fairly low-level type that is optimized for speed, so the best approach will likely depend on your own situation. In many situations it's probably fine to go back to string interpolation or StringBuilder (or not to convert to ReadOnlySpan at all). So

var sb = new StringBuilder();
return sb
    .Append(hello)
    .Append(space)
    .Append(world)
    .ToString();

or

return $"{hello.ToString()}{space.ToString()}{world.ToString()}";
Staffard answered 27/7, 2018 at 17:2 Comment(0)
D
9

You can achieve that with a buffer like this:

var hello = "Hello".AsSpan();
var space = " ".AsSpan();
var world = "World".AsSpan();

// First allocate the buffer with the target size
char[] buffer = new char[hello.Length + space.Length + world.Length];
// "Convert" it to writable Span<char>
var span = new Span<char>(buffer);

// Then copy each span at the right position in the buffer
int index = 0;
hello.CopyTo(span.Slice(index, hello.Length));
index += hello.Length;

space.CopyTo(span.Slice(index, space.Length));
index += space.Length;

world.CopyTo(span.Slice(index, world.Length));

// Finally, get the string back
string result = span.ToString();

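You can optimize further by using an ArrayPool to reuse the buffer. Note that Rent may return an array longer than requested, so slice it to the length you actually need:

int length = hello.Length + space.Length + world.Length;
char[] buffer = ArrayPool<char>.Shared.Rent(length);
var span = new Span<char>(buffer, 0, length);
// ...
ArrayPool<char>.Shared.Return(buffer);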
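
The ArrayPool idea above can be sketched as a complete helper. This is only an illustration: PooledConcat is a hypothetical name, and the sketch assumes three inputs as in the question. It limits the work to the requested length, since a rented array can be larger than asked for, and it returns the buffer in a finally block so it goes back to the pool even if copying throws.

```csharp
using System;
using System.Buffers;

// Hypothetical helper, not part of the BCL.
public static class PooledConcat
{
    public static string Concat(ReadOnlySpan<char> a, ReadOnlySpan<char> b, ReadOnlySpan<char> c)
    {
        int length = checked(a.Length + b.Length + c.Length);

        // Rent may hand back an array longer than requested.
        char[] rented = ArrayPool<char>.Shared.Rent(length);
        try
        {
            // Only work with the first `length` characters of the rented array.
            Span<char> span = rented.AsSpan(0, length);
            a.CopyTo(span);
            b.CopyTo(span.Slice(a.Length));
            c.CopyTo(span.Slice(a.Length + b.Length));

            // The final string is still one allocation; the pool only saves the temporary buffer.
            return new string(rented, 0, length);
        }
        finally
        {
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

Whether this beats a plain new char[] depends on the input sizes and how often the method runs, so it is worth measuring before adopting it.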
Delicatessen answered 31/8, 2018 at 22:29 Comment(2)
ArrayPool does not give any benefits. In fact it is slower (for these strings) and allocates additional 8 bytes.Unclinch
True, but in a long lived application it makes sense to use an ArrayPool so you don't have to reallocate those bytes over and over. It can make a significant difference depending upon the app.Demeanor
E
4

I wrote and use the following extension/tool methods to concatenate spans:

   ReadOnlySpan<T> otherSpan = ...
   T[] someArray = ...

   var myArray = someSpan.Concat(otherSpan, someArray, etc);
   var myArray2 = SpanTool.Concat(someArray, otherSpan, etc); 

SpanExtensions.cs

    public static class SpanExtensions {

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static T[] Concat<T>(this ReadOnlySpan<T> span0, ReadOnlySpan<T> span1)
            => SpanTool.Concat(span0, span1);

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static T[] Concat<T>(this ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2)
            => SpanTool.Concat(span0, span1, span2);

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static T[] Concat<T>(this ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2, ReadOnlySpan<T> span3)
            => SpanTool.Concat(span0, span1, span2, span3);

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static T[] Concat<T>(this ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2, ReadOnlySpan<T> span3, ReadOnlySpan<T> span4)
            => SpanTool.Concat(span0, span1, span2, span3, span4);

    }

SpanTool.cs


 public static class SpanTool {

        public static T[] Concat<T>(ReadOnlySpan<T> span0) {
            var result = new T[span0.Length];
            span0.CopyTo(result);
            return result;
        }

        public static T[] Concat<T>(ReadOnlySpan<T> span0, ReadOnlySpan<T> span1) {
            var result = new T[span0.Length + span1.Length];
            var resultSpan = result.AsSpan();
            span0.CopyTo(result);
            var from = span0.Length;
            span1.CopyTo(resultSpan.Slice(from));
            return result;
        }

        public static T[] Concat<T>(ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2) {
            var result = new T[span0.Length + span1.Length + span2.Length];
            var resultSpan = result.AsSpan();
            span0.CopyTo(result);
            var from = span0.Length;
            span1.CopyTo(resultSpan.Slice(from));
            from += span1.Length;
            span2.CopyTo(resultSpan.Slice(from));
            return result;
        }

        public static T[] Concat<T>(ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2, ReadOnlySpan<T> span3) {
            var result = new T[span0.Length + span1.Length + span2.Length + span3.Length];
            var resultSpan = result.AsSpan();
            span0.CopyTo(result);
            var from = span0.Length;
            span1.CopyTo(resultSpan.Slice(from));
            from += span1.Length;
            span2.CopyTo(resultSpan.Slice(from));
            from += span2.Length;
            span3.CopyTo(resultSpan.Slice(from));
            return result;
        }

        public static T[] Concat<T>(ReadOnlySpan<T> span0, ReadOnlySpan<T> span1, ReadOnlySpan<T> span2, ReadOnlySpan<T> span3, ReadOnlySpan<T> span4) {
            var result = new T[span0.Length + span1.Length + span2.Length + span3.Length + span4.Length];
            var resultSpan = result.AsSpan();
            span0.CopyTo(result);
            var from = span0.Length;
            span1.CopyTo(resultSpan.Slice(from));
            from += span1.Length;
            span2.CopyTo(resultSpan.Slice(from));
            from += span2.Length;
            span3.CopyTo(resultSpan.Slice(from));
            from += span3.Length;
            span4.CopyTo(resultSpan.Slice(from));
            return result;
        }
    }

DISCLAIMER: I just wrote this for my project because the answers here were unsatisfactory. If I find any bugs, I will edit this later.

Emetine answered 23/6, 2020 at 1:14 Comment(1)
All these allocate arrays for the result - which is to be expected since [ReadOnly]Span only appears to be for contiguous memory. I wonder if there's a Span-like type for discontiguous regions of memory.Ignescent
H
3

Another option is to use string.Concat, which accepts ReadOnlySpan<char> parameters. Here is the implementation, taken from GitHub:

internal static unsafe string Concat(ReadOnlySpan<char> str0, ReadOnlySpan<char> str1, ReadOnlySpan<char> str2)
{
    var result = new string('\0', checked(str0.Length + str1.Length + str2.Length));
    fixed (char* resultPtr = result)
    {
        var resultSpan = new Span<char>(resultPtr, result.Length);

        str0.CopyTo(resultSpan);
        resultSpan = resultSpan.Slice(str0.Length);

        str1.CopyTo(resultSpan);
        resultSpan = resultSpan.Slice(str1.Length);

        str2.CopyTo(resultSpan);
    }
    return result;
}

https://github.com/dotnet/runtime/blob/4f9ae42d861fcb4be2fcd5d3d55d5f227d30e723/src/libraries/Microsoft.IO.Redist/src/Microsoft/IO/StringExtensions.cs

Hydrocephalus answered 21/2, 2020 at 1:58 Comment(1)
Note that this implementation uses the unsafe keyword and you can only use it by allowing unsafe code in the Properties page of your project (or by setting AllowUnsafeBlocks to true in your csproj file). That's usually not a good idea for business applications.Pricilla
F
1

Nearly all of the solutions posted as answers here defeat the purpose of ReadOnlyMemory and Span: to avoid allocating or copying memory unless necessary. This article explains in depth how to join two or more spans into a sequence reader: https://www.stevejgordon.co.uk/creating-a-readonlysequence-from-array-data-in-dotnet. Steve's post shows that it may be overkill and that a Pipe/PipeReader/PipeWriter would suffice in nearly all common scenarios.

My scenario uses an IEnumerable<ReadOnlyMemory<char>> to simulate a SequenceReader. Note that with this approach I can't go back after reading, which is possible with both SequenceReader and Pipe.

My sample code:

// These values are allocated only once and serve as an example of reusing them in code.
private static ReadOnlyMemory<char> space = new[]{ ' ' };
private static ReadOnlyMemory<char> hello = "Hello".AsMemory();
private static ReadOnlyMemory<char> world = "World".AsMemory();
public IEnumerable<ReadOnlyMemory<char>> PrintHelloWorld(string concatValue)
{
    yield return hello;
    yield return space;
    yield return concatValue.AsMemory(); // We receive a string, so wrap it in a ReadOnlyMemory<char>. AsMemory() does not copy the characters; it only creates a small struct that references the existing string.
    yield return space;
    yield return world;
}

public void PrintToConsole()
{
    foreach(var part in PrintHelloWorld("Stackoverflow"))
    {
        Console.Out.Write(part.Span); // write the characters directly, without creating an intermediate string
    }

}

That works in a lot of scenarios (reading/writing streams mainly).
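
For comparison, the ReadOnlySequence approach mentioned above can be sketched as follows. This is a minimal illustration, not the linked article's code: MemorySegment<T> and SequenceDemo are hypothetical names. Each segment only references existing memory, so no characters are copied when the sequence is built.

```csharp
using System;
using System.Buffers;

// Hypothetical segment type; ReadOnlySequenceSegment<T> is the framework base class.
public sealed class MemorySegment<T> : ReadOnlySequenceSegment<T>
{
    public MemorySegment(ReadOnlyMemory<T> memory)
    {
        Memory = memory;
    }

    // Links a new segment after this one and returns it so calls can be chained.
    public MemorySegment<T> Append(ReadOnlyMemory<T> memory)
    {
        var next = new MemorySegment<T>(memory)
        {
            // RunningIndex is the offset of the new segment from the start of the sequence.
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}

public static class SequenceDemo
{
    public static ReadOnlySequence<char> Build()
    {
        var first = new MemorySegment<char>("Hello".AsMemory());
        var last = first.Append(" ".AsMemory()).Append("World".AsMemory());

        // The sequence spans from the start of the first segment to the end of the last one.
        return new ReadOnlySequence<char>(first, 0, last, last.Memory.Length);
    }
}
```

Iterating the result with foreach yields one ReadOnlyMemory<char> per segment, which can be written out with Console.Out.Write(segment.Span) in the same way as the enumerable version.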

Fatsoluble answered 7/2, 2023 at 12:51 Comment(0)
P
1

OK, .NET 8 has landed. Using collection expressions (C# 12), this becomes trivial.

ReadOnlySpan<char> result = [..hello, ..space, ..world];
Pluri answered 2/6 at 10:56 Comment(0)
E
-5

My quick Friday afternoon thoughts:

var hello = "Hello";
var helloS = hello.AsSpan();
var spaceS = " ".AsSpan();
var worldS = "World".AsSpan();

var sentence = helloS.ToString() + spaceS.ToString() + worldS.ToString();

//Gives "Hello World"

At least according to my quick play in LINQPad, and a quick read of the System.Memory source.

Ensiform answered 23/11, 2018 at 9:6 Comment(1)
Context for why this is a bad solution: The idea of using Spans is to reduce the heap allocations. You lose this advantage by converting them to strings. However, this works.Characteristic
