When is it more efficient to pass structs by value and when by ref in C#?
S

4

43

I've researched a bit and it seems that the common wisdom says that structs should be under 16 bytes because otherwise they incur a performance penalty for copying. With C# 7 and ref returns it became quite easy to avoid copying structs altogether. I assume that as the struct size gets smaller, passing by ref has more overhead than just copying the value.

Is there a rule of thumb about when passing structs by value becomes faster than by ref? What factors affect this? (Struct size, process bitness, etc.)

More context

I'm working on a game with the vast majority of data represented as contiguous arrays of structs for maximum cache-friendliness. As you might imagine, passing structs around is quite common in such a scenario. I'm aware that profiling is the only real way of determining the performance implications of something. However, I'd like to understand the theoretical concepts behind it and hopefully write code with that understanding in mind and profile only the edge cases.
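
For illustration, here is a minimal sketch of the ways a struct can cross a method boundary in the scenario described above. The names Particle, ParticleOps and their members are hypothetical and are not taken from the actual game code; the 32-byte layout is just an assumption chosen to be larger than a pointer.

```csharp
using System;

// Hypothetical 32-byte component, used only for illustration.
public struct Particle
{
    public double X, Y;
    public double VelX, VelY;
}

public static class ParticleOps
{
    // By value: the callee works on its own copy of the struct.
    public static double SpeedSquaredByValue(Particle p) =>
        p.VelX * p.VelX + p.VelY * p.VelY;

    // By readonly reference (C# 7.2): only a reference is passed,
    // and the callee cannot modify the caller's struct.
    public static double SpeedSquaredByIn(in Particle p) =>
        p.VelX * p.VelX + p.VelY * p.VelY;

    // By ref: the callee mutates the caller's storage in place.
    public static void Step(ref Particle p, double dt)
    {
        p.X += p.VelX * dt;
        p.Y += p.VelY * dt;
    }

    // Ref return (C# 7.0): hands out a reference to the array slot itself,
    // so no copy of the element is made.
    public static ref Particle At(Particle[] particles, int index) =>
        ref particles[index];
}
```

With this shape, `ref Particle p = ref ParticleOps.At(particles, i); ParticleOps.Step(ref p, dt);` updates the array element directly, whereas `SpeedSquaredByValue(particles[i])` copies the element before the call.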

Also, please note that I'm not asking about best practices or the sanity of passing everything by ref. I'm aware of "best practices" and implications and I deliberately choose not to follow them.

Addressing the "duplicate" tag

Performance of pass by value vs. pass by reference in C# .NET - This question discusses passing a reference type by ref which is completely different to what I'm asking.

In .Net, when if ever should I pass structs by reference for performance reasons? - The second question touches the subject a bit, but it's about a specific size of the struct.

To answer the questions from Eric Lippert's article:

Do you really need to answer that question? Yes I do. Because it'll affect how I write a lot of code.

Is that really the bottleneck? Probably not. But I'd still like to know since that's the data access pattern for 99% of the program. In my mind this is similar to choosing the correct data structure.

Is the difference relevant? It is. Passing large structs by ref is faster. I'm just trying to understand the limits of this.

What is this “faster” you speak of? As in giving less work to the CPU for the same task.

Are you looking at the big picture? Yes. As previously stated, it affects how I write the whole thing.

I know I could measure a lot of different combinations. And what does that tell me? That X is faster than Y on my combination of [.NET Version, process bitness, OS, CPU]. What about Linux? What about Android? What about iOS? Should I benchmark all permutations on all possible hardware/software combinations?

I don't think that's a viable strategy. Therefore I ask here where hopefully someone who knows a lot about CLR/JIT/ASM/CPU can tell me how that works so I can make informed decisions when writing code.

The answer I'm looking for is similar to the aforementioned 16 byte guideline for struct sizes with the explanation why.

Schottische answered 19/9, 2017 at 4:28 Comment(20)
Which is faster?Hesperidium
@PeterDuniho - the other question is asking about passing a reference type by ref and my question is strictly about the size of structs. Also, as 90% of my data access patterns are affected by this I can't profile all the permutations of various data distributions.Schottische
There are two marked duplicates. The first includes some discussion of the value type scenario, while the second is entirely about that. As far as "I can't profile all the permutations of data distribution" goes, you can profile the important permutations. Scenarios that don't come up often aren't worth optimizing for.Hesperidium
"passing structs around is quite common in such a scenario" -- why? putting things in an array helps only if you access the thing from the array. As soon as you copy it out into a variable, you're no longer taking advantage of the data the array populated the cache with. In any case, your question is purely speculative and too broad. There are too many things that can affect performance for anyone to be able to definitively tell you how large your structs can be without needing to pass by ref, especially since the design choice will affect things other than method calls.Hesperidium
The answers in the second question boil down to when to use a struct and when not to. I have a real-world scenario where almost ALL of my data is represented as structs of widely varying sizes.Schottische
You're obviously completely missing the point. I'm NOT copying anything because I'm passing and returning by ref.Schottische
"The answers in the second question boil down to when to use a struct and when not to" -- and so does your question. If you're always passing by ref, why are you asking the question? If you're trying to compare passing by ref with not passing by ref, then the scenario where you're not passing by ref involves copying the data. Either way, your question remains too broad.Hesperidium
What's broad about the part marked as bold in my question? I'm certain it can be unequivocally answered in 2 sentences by someone who understands the inner workings of CLR and the JITter. The rest of the question was meant to describe how my scenario is different than the ones in questions you linked for example.Schottische
You think two sentences would cover "what factors affect this?" Sorry, if you really believe that, you have seriously underestimated the complexities of performance tuning. Again, I refer you to Which is faster?. No answer within the intended scope of Stack Overflow is going to adequately cover your question.Hesperidium
That is just your opinion. Since I'd still like to get my answer - what are my options? How can I get someone who understands more about the topic to see this question now that it's closed? Edit it? Flag it?Schottische
This is my thought process approaching the question: There are two factors affecting performance. The first is where to allocate memory. The second is how you pass data to a method. If only considering the parameter passing performance, passing by value will never be faster than passing by reference, unless the struct you are passing is smaller than the size of a reference. You lose performance passing by value, but you gain performance by allocating memory on the stack. How much you can lose is determined by how much you can gain. How much you can gain is a much bigger topic though.Gaul
@PeterDuniho: Passing an array element as a ref parameter will allow the recipient to act upon it in place, even if the element is a structure type. That's one of the big advantages of using arrays of structure types.Wakeful
@supercat: "Passing an array element as a ref parameter will allow the recipient to act upon it in place" -- I'm well aware of that. So what? Since the question is asking to compare passing by value with passing by reference, one must assume that there is no need to modify the value. Otherwise, it wouldn't even be a question, because passing by reference would be the only option.Hesperidium
@PeterDuniho: I was referring to your statement about how having structures in an array is only useful when accessing things from the array. It wasn't clear whether you were counting accesses through a byref as accesses from the array.Wakeful
@PeterDuniho: "one must assume that there is no need to modify the value" -- what are you talking about? You can copy it from the array, modify it and copy it back.Schottische
@loodakrawa: "You can copy it from the array, modify it and copy it back" -- you could. But that would be pretty silly to do that if passing by reference was already under consideration anyway. Especially if one hasn't bothered to do any actual performance testing to see if there's some benefit to completely ignoring the semantics of the operation as a basis of design.Hesperidium
That is the whole point. When the structs are small enough, copying is faster. I'm trying to understand what determines that limit - probably the way the CLR/JIT handles refs. Performance testing doesn't really tell me WHY which essentially is my question. See edits.Schottische
"why" questions are hard to answer. If your question is "what machine code is generated by the jitter for a copy by ref vs a copy by value?" then use the debugger to look at the machine code that is generated for your particular scenario.Asynchronism
This is actually a relevant question. I see it as totally idiotic for 99% of all programs - but if you do an inner game loop, or e.g. a ticker plant for a trading backend, this is the type of issue that really comes up and MAKES A DIFFERENCE. It gets even more relevant if you take span into account so you can move around views / parts of an array without copying. I personally have some programs where the core loop is about 3 pages of code, uses 95% of the processing time and runs loops updating values in an array that are represented as structs for performance reasons. Good question.Idel
It's not the answer you are looking for but you may be interested in this question: #2438425 plus my own basic investigation here: forum.unity.com/threads/opinions-about-tokenizing.362531/… The take away for me was that this question is better left unanswered. Not because nobody knows but because it is an implementation detail subject to getting changed over time and environment.Famished
S
5

I finally found the answer. The breaking point is System.IntPtr.Size. In Microsoft's own words from Write safe and efficient C# code:

Add the in modifier to pass an argument by reference and declare your design intent to pass arguments by reference to avoid unnecessary copying. You don't intend to modify the object used as that argument.

This practice often improves performance for readonly value types that are larger than IntPtr.Size. For simple types (sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal and bool, and enum types), any potential performance gains are minimal. In fact, performance may degrade by using pass-by-reference for types smaller than IntPtr.Size.
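
As a hedged illustration of the quoted guidance, the sketch below passes a readonly struct with the in modifier and checks a type's size against IntPtr.Size. Vector3d, InDemo and IsLargerThanPointer are hypothetical names introduced here, not part of the Microsoft article; Unsafe.SizeOf<T>() comes from System.Runtime.CompilerServices.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical 24-byte readonly struct: larger than IntPtr.Size
// in both 32-bit and 64-bit processes.
public readonly struct Vector3d
{
    public readonly double X, Y, Z;
    public Vector3d(double x, double y, double z) { X = x; Y = y; Z = z; }
    public double LengthSquared() => X * X + Y * Y + Z * Z;
}

public static class InDemo
{
    // 'in' passes a readonly reference instead of copying 24 bytes per argument.
    public static double Dot(in Vector3d a, in Vector3d b) =>
        a.X * b.X + a.Y * b.Y + a.Z * b.Z;

    // Because Vector3d is declared readonly, calling a method through an
    // 'in' parameter does not require the compiler to make a defensive copy.
    public static double Length(in Vector3d v) => Math.Sqrt(v.LengthSquared());

    // True when T is larger than a pointer in the current process, which is
    // the case where the quoted guidance suggests 'in' often helps.
    public static bool IsLargerThanPointer<T>() where T : struct =>
        Unsafe.SizeOf<T>() > IntPtr.Size;
}
```

Note that the guidance speaks of readonly value types: for a struct that is not readonly, member calls through an in parameter can trigger hidden defensive copies, which may cancel out the benefit.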

Schottische answered 21/9, 2020 at 8:20 Comment(0)
M
6

Generally, passing by reference should be faster.
When you pass a struct by reference, you are only passing a pointer to the struct, which is just a 32- or 64-bit integer.
When you pass a struct by value, you need to copy the entire struct and then pass a pointer to the new copy.
Unless the struct is very small (for example, an int), passing by reference is faster.

Also, passing by value would increase the number of calls to the OS for memory allocation and de-allocation; these calls are time-consuming as the OS has to check a registry for available space.

Manes answered 11/3, 2020 at 13:45 Comment(11)
To clarify, in this case "very small" = "smaller than a pointer on the device"Brantbrantford
@Brantbrantford yes, though in certain circumstances you would still want to pass by value to make your code easier to work with, though that REALLY depends on the situationManes
Needs more empirical data. Passing “a pointer” requires an additional indirection, so (depending on implementation) there is some cut-over point beyond this simple generality. What is that inflection point, and where? How/why does it differ across environments?Steak
One other consideration: does passing a struct by ref cause the struct to be boxed/unboxed? If so, that would increase the cost of passing by ref. Then a larger size threshold might be appropriate for passing by ref instead of by value.Gisele
@Gisele According to Microsoft, "There is no boxing of a value type when it is passed by reference."Gisele
@Steak in standard implementations you pass a pointer to the struct whether it's ref or value, you don't pass a pointer to the pointer as that would be needless indirection and any good compiler would not do thatManes
"passing by value would increase the number of calls to the os for memory allocation and de-allocation" - that sounds very wrong. Do you have a source for that? I would expect it to use the stack memory with no need for any extra allocations.Schottische
@Schottische in many cases, for example structs that are array elements, structs will be heap allocated; in this case, the structs are in arrays. Though I agree that that doesn't always apply. Also, stack/heap allocation is implementation dependent, so it's better to not rely on that.Manes
Well, I agree, but that has nothing to do with passing structs around. Once you allocate an array of structs there are no more heap allocations no matter how you access the data in the array - either via ref or copying. And your answer states that passing by value would allocate additional memorySchottische
Also, as @Steak said - I'm asking about the inflection point. 1 int is faster, I agree. What about 2 ints? 3? 4?Schottische
@Schottische the inflection point is implementation and hardware dependent, it depends on the efficiency of the garbage collector, speed of different operations etc... in general I would say it should be below 12 ints but for the inflection point you need to do tests.Manes
T
4

If you pass structs around by reference, they can be of any size: you are still only dealing with an 8-byte pointer (x64 assumed). For the highest performance you need a CPU-cache-friendly design, which is called Data-Oriented Design.

Games often use a special form of Data-Oriented Design called Entity Component System. See the book Pro .NET Memory Management by Konrad Kokosa, Chapter 14.

The basic idea is that your game entities (e.g. Movable, Car, Plane, ...) share common properties such as a position, which is stored for all entities in one contiguous array. If you need to increment the position of 1K entities, you just look up each entity's index in the position array and update the values there. This provides the best possible data locality. If everything were stored in classes, the CPU prefetcher would be defeated by the many this pointers, one for each class instance.
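
The idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how any particular ECS framework is implemented: Position, Velocity and MovementSystem are hypothetical names, and an entity is assumed to be nothing more than an index into the component arrays.

```csharp
using System;

// Hypothetical component types; the names are illustrative only.
public struct Position { public float X, Y; }
public struct Velocity { public float X, Y; }

public sealed class MovementSystem
{
    // One contiguous array per component; an entity is just an index.
    private readonly Position[] _positions;
    private readonly Velocity[] _velocities;

    public MovementSystem(int capacity)
    {
        _positions = new Position[capacity];
        _velocities = new Velocity[capacity];
    }

    // Ref returns let callers read or write a component in place.
    public ref Position PositionOf(int entity) => ref _positions[entity];
    public ref Velocity VelocityOf(int entity) => ref _velocities[entity];

    // Walks both arrays linearly and updates positions in place,
    // without copying any struct out of its array.
    public void Update(float dt)
    {
        for (int i = 0; i < _positions.Length; i++)
        {
            ref Position p = ref _positions[i];
            ref readonly Velocity v = ref _velocities[i];
            p.X += v.X * dt;
            p.Y += v.Y * dt;
        }
    }
}
```

Because Update touches consecutive elements of plain struct arrays, the memory access pattern is sequential, which is the data locality the paragraph above refers to.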

See this Intel post about some reference architecture: https://software.intel.com/en-us/articles/get-started-with-the-unity-entity-component-system-ecs-c-sharp-job-system-and-burst-compiler

There are plenty of Entity Component Systems out there, but so far I have seen none that use ref structs as their main working data structure. The reason is that all the popular ones have existed much longer than C# 7.2, where ref structs were introduced.

Traprock answered 13/3, 2020 at 22:32 Comment(4)
Implementing an ECS as the core architecture is precisely the reason why I asked the question in the first place. Anyway, if the data is not stored as an array of structs but as an array of classes, then it's likely that the data will be spread across the heap and not contiguous in any way because an array of classes is effectively an array of pointers to the heapSchottische
@loodakrawa: Did you complete your ECS system with C# 7.2 features? Is it open source? It would be interesting to see how yours compares to others like Entitas.Traprock
I implemented it ~3 years ago so I didn't use the 7.2 features. I stopped trying to make games since but I'm getting into it again now so I'll probably upgrade it with the new goodies. Anyway, you can check it out here: github.com/loodakrawa/ScatteredLogicSchottische
Here, open-source ECS framework with ref structs: LeoECS. Fast and not only for games. Enjoy that speed =)Scorpaenoid
D
0

Great answer. But I think the breaking point cannot be System.IntPtr.Size, at least not in all scenarios. I think there are two categories here (assuming 64 bit words):

  1. Most code

Define structs up to 3 words (24 bytes). Do not bother with passing by reference. For example System.Guid and System.Decimal are 2 words. Latest Framework Design Guidelines recommendation for class vs struct:

CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects, especially arrays.

AVOID defining a struct unless the type has all of the following characteristics:

  • It logically represents a single value, similar to primitive types (int, double, etc.).
  • It has an instance size less than 24 bytes.
  • It is immutable.
  • It will not have to be boxed frequently.

  2. Performance-sensitive code

For this category you just have to measure. Theoretically, passing even a Decimal by value may incur some overhead. Microsoft mentions a negligible cost for copying 3 words or less:

The cost of copying a value is negligible if the types are small, three words or less (considering one word being of natural size of one integer). It's measurable and can have real performance impact for larger types.
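
To see where a given struct falls relative to the three-word figure, its size can be checked at run time. The helper below is a small sketch: StructSize is a hypothetical name, and it assumes that a "word" means IntPtr.Size bytes, which matches the 64-bit assumption made above but is only one reading of the quoted sentence.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class StructSize
{
    // Managed size of T in bytes, as reported by the runtime.
    public static int Bytes<T>() where T : struct => Unsafe.SizeOf<T>();

    // Size of T rounded up to whole machine words,
    // assuming one word is IntPtr.Size bytes.
    public static int Words<T>() where T : struct =>
        (Unsafe.SizeOf<T>() + IntPtr.Size - 1) / IntPtr.Size;
}
```

For example, `StructSize.Bytes<Guid>()` and `StructSize.Bytes<decimal>()` both report 16 bytes, which is two words in a 64-bit process.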

Desberg answered 9/5 at 19:41 Comment(0)
