Practical use of `stackalloc` keyword

6

194

Has anyone ever actually used stackalloc while programming in C#? I am aware of what it does, but the only time it shows up in my code is by accident, because IntelliSense suggests it when I start typing static, for example.

Although it is not related to the usage scenarios of stackalloc, I actually do a considerable amount of legacy interop in my apps, so every now and then I could resort to using unsafe code. But nevertheless I usually find ways to avoid unsafe completely.

And since the stack size for a single thread in .NET is ~1 MB (correct me if I'm wrong), I am even more hesitant to use stackalloc.

Are there some practical cases where one could say: "this is exactly the right amount of data and processing for me to go unsafe and use stackalloc"?

Bannister answered 24/4, 2009 at 9:51 Comment(1)
just noticed that System.Numbers uses it a lot referencesource.microsoft.com/#mscorlib/system/…Puli
H
216

The sole reason to use stackalloc is performance (either for computations or interop). By using stackalloc instead of a heap-allocated array, you create less GC pressure (the GC needs to run less often), you don't need to pin the array, allocation is faster than for a heap array, and the memory is freed automatically on method exit (heap-allocated arrays are only deallocated when the GC runs). Compared with a native allocator (like malloc or the .NET equivalent), stackalloc is also faster and gives you automatic deallocation when the method returns.

Performance-wise, using stackalloc also greatly increases the chance of CPU cache hits, because of the locality of the data.

Halley answered 24/4, 2009 at 10:8 Comment(4)
Locality of data, good point! That's what managed memory will rarely achieve when you want to allocate several structures or arrays. Thanks!Bannister
Heap allocations are usually faster for managed objects than for unmanaged because there's no free list to traverse; the CLR just increments the heap pointer. As for locality, sequential allocations are more likely to end up colocated for long running managed processes because of heap compaction.Emissive
"it's faster to allocate than a heap array" Why's that? Just locality? Either way it's just a pointer-bump, no?Curtilage
@MaxBarraclough Because you add the GC cost to heap allocations over the application lifetime. Total allocation cost = allocation + deallocation: pointer bump + GC for the heap, versus pointer bump + pointer decrement for the stack.Halley
S
49

I have used stackalloc to allocate buffers for [near] realtime DSP work. It was a very specific case where performance needed to be as consistent as possible. Note there is a difference between consistency and overall throughput - in this case I wasn't concerned with heap allocations being too slow, just with the non determinism of garbage collection at that point in the program. I wouldn't use it in 99% of cases.
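To illustrate the kind of buffer described above, here is a minimal sketch, not the original DSP code. The names `BlockProcessor` and `PeakAfterGain` and the 512-sample limit are assumptions made up for this example. The point is only that the scratch space lives on the stack, so the hot path allocates nothing on the heap and creates no work for the GC.

```csharp
using System;

public static class BlockProcessor
{
    // Hypothetical upper bound on samples per block (512 floats = 2 KB of stack).
    private const int MaxBlockSize = 512;

    // Applies a gain to one block of samples and returns the peak absolute level.
    public static float PeakAfterGain(ReadOnlySpan<float> block, float gain)
    {
        if (block.Length > MaxBlockSize)
            throw new ArgumentException("Block too large for the stack buffer.", nameof(block));

        // Constant-size scratch space on the stack: no heap allocation happens here.
        Span<float> buffer = stackalloc float[MaxBlockSize];
        Span<float> scratch = buffer.Slice(0, block.Length);

        for (int i = 0; i < block.Length; i++)
            scratch[i] = block[i] * gain;

        float peak = 0f;
        foreach (float sample in scratch)
            peak = Math.Max(peak, Math.Abs(sample));
        return peak;
    }
}
```

Rejecting oversized blocks up front keeps the stack usage bounded and predictable, which matters more here than raw throughput.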

Supra answered 24/4, 2009 at 10:26 Comment(0)
P
43

Stackalloc initialization of spans. In previous versions of C#, the result of stackalloc could only be stored in a pointer local variable. As of C# 7.2, stackalloc can be used as part of an expression and can target a span, and that can be done without using the unsafe keyword. Thus, instead of writing:

Span<byte> bytes;
unsafe
{
  byte* tmp = stackalloc byte[length];
  bytes = new Span<byte>(tmp, length);
}

You can write simply:

Span<byte> bytes = stackalloc byte[length];

This is also extremely useful in situations where you need some scratch space to perform an operation, but want to avoid allocating heap memory for relatively small sizes:

Span<byte> bytes = length <= 128 ? stackalloc byte[length] : new byte[length];
... // Code that operates on the Span<byte>

Source: C# - All About Span: Exploring a New .NET Mainstay
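For a complete version of the scratch-space pattern, here is a minimal sketch that is not taken from the cited article. The `HexFormatter` class, the `ToHex` method, and the 256-character threshold are assumptions chosen for illustration. Small outputs are built on the stack and larger ones fall back to a heap array, and no `unsafe` context is needed.

```csharp
using System;

public static class HexFormatter
{
    // Hypothetical threshold: the most chars we are willing to place on the stack.
    private const int StackLimit = 256;

    // Formats the bytes as a lowercase hexadecimal string.
    public static string ToHex(ReadOnlySpan<byte> data)
    {
        int charCount = data.Length * 2;

        // Small results use the stack; larger ones fall back to a heap array.
        Span<char> chars = charCount <= StackLimit
            ? stackalloc char[charCount]
            : new char[charCount];

        const string digits = "0123456789abcdef";
        for (int i = 0; i < data.Length; i++)
        {
            chars[2 * i] = digits[data[i] >> 4];       // high nibble
            chars[2 * i + 1] = digits[data[i] & 0xF];  // low nibble
        }

        return new string(chars);
    }
}
```

The threshold is what keeps the stack usage bounded when the length comes from the caller. The right value depends on how deep the call stack can get, so treat 256 as a placeholder.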

Porch answered 9/12, 2018 at 11:18 Comment(3)
Thanks for the tip. It seems each new version of C# gets a bit closer to C++, which is actually a good thing IMHO.Bannister
As can be seen here and here, Span is alas not available in .NET framework 4.7.2 and even not in 4.8... So that new language feature is still of limited use for the moment.Winer
@Frederic: apparently, .NET framework 4.8 is the last major version of .NET framework. Next version of .NET Core will be called .NET 5 (no "core" anymore, and version 4 skipped to avoid confusion with the framework), so the future of .NET is .NET Core, and it's now evident that .NET Framework apps won't get this update. I also believe the idea is to remove .NET Standard in the future (since there will be just ".NET" from this point on).Bannister
H
26

stackalloc is only relevant for unsafe code. For managed code you can't decide where to allocate data. Value types are allocated on the stack by default (unless they are part of a reference type, in which case they are allocated on the heap). Reference types are allocated on the heap.

The default stack size for a plain vanilla .NET application is 1 MB, but you can change this in the PE header. If you're starting threads explicitly, you may also set a different size via the constructor overload. For ASP.NET applications the default stack size is only 256K, which is something to keep in mind if you're switching between the two environments.
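As a rough sketch of the constructor overload mentioned above, the following runs a method on a thread created with an explicit `maxStackSize`. The names `BigStackDemo`, `RunWithStack`, and `FillOnStack`, and the 8 MB and 1 MB figures, are illustrative assumptions rather than recommendations. The operating system may also round or adjust the requested size.

```csharp
using System;
using System.Threading;

public static class BigStackDemo
{
    // Runs `work` on a dedicated thread created with the requested maximum stack size.
    public static long RunWithStack(int maxStackSizeBytes, Func<long> work)
    {
        long result = 0;
        var thread = new Thread(() => result = work(), maxStackSizeBytes);
        thread.Start();
        thread.Join(); // wait for the worker to finish before reading `result`
        return result;
    }

    // Places a 1 MB buffer on the current thread's stack and sums its contents.
    public static long FillOnStack()
    {
        Span<int> buffer = stackalloc int[256 * 1024]; // 256K ints * 4 bytes = 1 MB
        buffer.Fill(1);

        long sum = 0;
        foreach (int value in buffer)
            sum += value;
        return sum;
    }
}
```

A call such as `BigStackDemo.RunWithStack(8 * 1024 * 1024, BigStackDemo.FillOnStack)` gives the worker room for the 1 MB buffer. A stack overflow cannot be caught in .NET and terminates the process, so it is worth keeping such sizes conservative.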

Hepatica answered 24/4, 2009 at 10:3 Comment(3)
Is it possible to change the default stack size from Visual Studio?Gerbil
@configurator: Not as far as I am aware.Hepatica
This is no longer the case with C# 7.2's Span<T> and ReadOnlySpan<T>. @Anth pointed this out in an answer below.Horribly
C
11

Late answer but I believe still helpful.

I came to this question and was still curious to see the performance difference, so I created the following benchmark (using the BenchmarkDotNet NuGet package):

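using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;

[MemoryDiagnoser]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
[RankColumn]
public class Benchmark1
{
    //private MemoryStream ms = new MemoryStream();

    // Simulates reading `length` bytes into `buffer`, beginning at index `start`.
    static void FakeRead(byte[] buffer, int start, int length)
    {
        // Bound is start + length, so a non-zero start still writes `length` bytes.
        for (int i = start; i < start + length; i++)
            buffer[i] = (byte) (i % 250);
    }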

    static void FakeRead(Span<byte> buffer)
    {
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = (byte) (i % 250);
    }

    [Benchmark]
    public void AllocatingOnHeap()
    {
        var buffer = new byte[1024];
        FakeRead(buffer, 0, buffer.Length);
    }

    [Benchmark]
    public void ConvertingToSpan()
    {
        var buffer = new Span<byte>(new byte[1024]);
        FakeRead(buffer);
    }

    [Benchmark]
    public void UsingStackAlloc()
    {
        Span<byte> buffer = stackalloc byte[1024];
        FakeRead(buffer);
    }
}

And these were the results:

|           Method |     Mean |    Error |   StdDev | Rank |  Gen 0 | Allocated |
|----------------- |---------:|---------:|---------:|-----:|-------:|----------:|
|  UsingStackAlloc | 704.9 ns | 13.81 ns | 12.91 ns |    1 |      - |         - |
| ConvertingToSpan | 755.8 ns |  5.77 ns |  5.40 ns |    2 | 0.0124 |   1,048 B |
| AllocatingOnHeap | 839.3 ns |  4.52 ns |  4.23 ns |    3 | 0.0124 |   1,048 B |

This benchmark shows that using stackalloc is the fastest solution, and it also makes no heap allocations! If you are curious how to use the BenchmarkDotNet NuGet package, then watch this video.

Cathouse answered 25/3, 2022 at 15:4 Comment(3)
It probably indicates that BenchmarkDotNet does not measure stackalloc allocations.Rusk
Stack allocations are impactless, inconsequential, and irrelevant. You would not want any tool to measure them even if it could.Tactile
FYI, the runtime zeros the memory allocation. Add [System.Runtime.CompilerServices.SkipLocalsInit] to make it simply increment the stack pointer. For the other methods, doing so will just mean the variable pointing to the arrays won't be zero-initialized upon entry.Gilbertogilbertson
A
8

There are some great answers in this question but I just want to point out that

Stackalloc can also be used to call native APIs

Many native functions require the caller to allocate a buffer to receive the result. For example, the CfGetPlaceholderInfo function in cfapi.h has the following signature:

HRESULT CfGetPlaceholderInfo(
HANDLE                    FileHandle,
CF_PLACEHOLDER_INFO_CLASS InfoClass,
PVOID                     InfoBuffer,
DWORD                     InfoBufferLength,
PDWORD                    ReturnedLength);

To call it from C# through interop, declare it like this:

[DllImport("Cfapi.dll")]
public static unsafe extern HResult CfGetPlaceholderInfo(IntPtr fileHandle, uint infoClass, void* infoBuffer, uint infoBufferLength, out uint returnedLength);

You can then use stackalloc to provide the buffer:

byte* buffer = stackalloc byte[1024];
CfGetPlaceholderInfo(fileHandle, 0, buffer, 1024, out var returnedLength);
Altdorfer answered 14/10, 2019 at 15:24 Comment(0)
