I've researched a bit, and the common wisdom seems to be that structs should be under 16 bytes because otherwise they incur a performance penalty for copying. With C# 7 and ref returns it became quite easy to avoid copying structs altogether. I assume that as the struct size gets smaller, passing by ref has more overhead than just copying the value.
Is there a rule of thumb about when passing structs by value becomes faster than by ref? What factors affect this? (Struct size, process bitness, etc.)
More context
I'm working on a game with the vast majority of data represented as contiguous arrays of structs for maximum cache-friendliness. As you might imagine, passing structs around is quite common in such a scenario. I'm aware that profiling is the only real way of determining the performance implications of something. However, I'd like to understand the theoretical concepts behind it and hopefully write code with that understanding in mind and profile only the edge cases.
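To make the scenario concrete, here is a minimal sketch of the access patterns I'm comparing. The `Particle` type and the method names are made up for illustration and are not taken from my actual code; the `in` modifier assumes C# 7.2 or later.

```csharp
using System;

// Hypothetical 24-byte struct (six 4-byte floats), i.e. above the 16-byte guideline.
public struct Particle
{
    public float X, Y, Z;
    public float VX, VY, VZ;
}

public static class Demo
{
    // By value: the callee works on its own copy of the struct.
    public static float SpeedSquaredByValue(Particle p)
        => p.VX * p.VX + p.VY * p.VY + p.VZ * p.VZ;

    // By read-only reference (C# 7.2 'in'): the callee reads through a reference.
    public static float SpeedSquaredByRef(in Particle p)
        => p.VX * p.VX + p.VY * p.VY + p.VZ * p.VZ;

    // Ref return (C# 7.0): hands out a reference to the array element itself.
    public static ref Particle At(Particle[] items, int index)
        => ref items[index];

    public static void Main()
    {
        var particles = new Particle[4];

        // Mutate the element in place through a ref local; no copy is written back.
        ref Particle p = ref At(particles, 2);
        p.VX = 3f;
        p.VY = 4f;

        Console.WriteLine(particles[2].VX);                     // prints 3
        Console.WriteLine(SpeedSquaredByValue(particles[2]));   // prints 25
        Console.WriteLine(SpeedSquaredByRef(in particles[2]));  // prints 25
    }
}
```

Both calls compute the same result; the question is only about which one costs less as the size of the struct changes.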
Also, please note that I'm not asking about best practices or the sanity of passing everything by ref. I'm aware of "best practices" and implications and I deliberately choose not to follow them.
Addressing the "duplicate" tag
Performance of pass by value vs. pass by reference in C# .NET - This question discusses passing a reference type by ref which is completely different to what I'm asking.
In .Net, when if ever should I pass structs by reference for performance reasons? - The second question touches on the subject a bit, but it's about one specific struct size.
To answer the questions from Eric Lippert's article:
Do you really need to answer that question? Yes I do. Because it'll affect how I write a lot of code.
Is that really the bottleneck? Probably not. But I'd still like to know since that's the data access pattern for 99% of the program. In my mind this is similar to choosing the correct data structure.
Is the difference relevant? It is. Passing large structs by ref is faster. I'm just trying to understand the limits of this.
What is this “faster” you speak of? As in giving less work to the CPU for the same task.
Are you looking at the big picture? Yes. As previously stated, it affects how I write the whole thing.
I know I could measure a lot of different combinations. And what does that tell me? That X is faster than Y on my combination of [.NET Version, process bitness, OS, CPU]. What about Linux? What about Android? What about iOS? Should I benchmark all permutations on all possible hardware/software combinations?
I don't think that's a viable strategy. Therefore I ask here where hopefully someone who knows a lot about CLR/JIT/ASM/CPU can tell me how that works so I can make informed decisions when writing code.
The answer I'm looking for is similar to the aforementioned 16 byte guideline for struct sizes with the explanation why.
ref parameter will allow the recipient to act upon it in place, even if the element is a structure type. That's one of the big advantages of using arrays of structure types. – Wakeful