Following the discussions here on SO, I have read several times the remark that mutable structs are “evil” (as in the answer to this question).
What's the actual problem with mutability and structs in C#?
Structs are value types which means they are copied when they are passed around.
So if you change a copy you are changing only that copy, not the original and not any other copies which might be around.
If your struct is immutable then all automatic copies resulting from being passed by value will be the same.
If you want to change it you have to consciously do it by creating a new instance of the struct with the modified data. (not a copy)
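The copy behaviour described above can be seen in a minimal sketch. The `Counter` type and the `CopyDemo` class are hypothetical names introduced only for illustration.

```csharp
using System;

// Hypothetical mutable struct, for illustration only.
struct Counter
{
    public int Value;
}

static class CopyDemo
{
    // The parameter is a copy, so the caller's variable is untouched.
    public static void Bump(Counter c) { c.Value++; }

    public static (int Original, int Copy) Run()
    {
        var original = new Counter { Value = 1 };
        var copy = original;   // assignment copies the whole struct
        copy.Value = 42;       // only the copy changes
        Bump(original);        // the method mutates its own copy
        return (original.Value, copy.Value);
    }

    static void Main()
    {
        var (o, c) = Run();
        Console.WriteLine($"original={o}, copy={c}"); // original=1, copy=42
    }
}
```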
… public class ExposedFieldHolder<T> { public T Value; public ExposedFieldHolder(T v) { Value = v; } } then one could use it to turn any structure into an entity (making the type public is harmless, since the whole purpose of the type is to have exactly the indicated semantics, and exposing it would not expose any instances to code that shouldn't see them). – Willardwillcox
… System.Drawing for example (Point, Rectangle, ...). – Incursion
Where to start ;-p
Eric Lippert's blog is always good for a quote:
This is yet another reason why mutable value types are evil. Try to always make value types immutable.
First, you tend to lose changes quite easily... for example, getting things out of a list:
Foo foo = list[0];
foo.Name = "abc";
what did that change? Nothing useful...
The same with properties:
myObj.SomeProperty.Size = 22; // the compiler spots this one
forcing you to do:
Bar bar = myObj.SomeProperty;
bar.Size = 22;
myObj.SomeProperty = bar;
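A minimal sketch of the lost update with a list, assuming `Foo` is a struct with a public `Name` field (the answer does not show its definition). The same read, modify, write-back pattern that the compiler forces for properties is what makes the change stick here.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of Foo; the original answer does not define it.
struct Foo
{
    public string Name;
}

static class ListDemo
{
    public static (string Before, string After) Run()
    {
        var list = new List<Foo> { new Foo { Name = "original" } };

        Foo foo = list[0];            // the indexer returns a copy
        foo.Name = "abc";             // changes the local copy only
        string before = list[0].Name; // the list element is unchanged

        list[0] = foo;                // writing the copy back stores the change
        string after = list[0].Name;
        return (before, after);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"{before} -> {after}"); // original -> abc
    }
}
```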
Less critically, there is a size issue: mutable objects tend to have multiple properties, yet if you have a struct with two ints, a string, a DateTime and a bool, you can very quickly burn through a lot of memory. With a class, multiple callers can share a reference to the same instance (references are small).
… ++ operator. In this case, the compiler just writes the explicit assignment itself instead of hustling the programmer. – Cymene
… test.Bar++ or test.Bar += 1. This is also mutating the value – in fact, it’s more or less equivalent to var tmp = test.Bar; tmp += 1; test.Bar = tmp;. So the situation is not completely identical but that was never my point. My point was that the compiler can and does rewrite code where it makes sense. And the resulting rewrite here is almost completely identical to the manual rewrite you proposed in your answer. The only thing that’s different is bar.Size = 22; vs. bar = bar + 1. – Cymene
… myObj.SomeProperty.Size = modify a copy of myObj.SomeProperty but bar.Size = modify bar and not a copy of bar? That is where the poor decision seems to be... – Blowhole
… SomeProperty is not actually a property (perhaps it is a field?), or the type of SomeProperty is not actually a struct. Here's a minimal repro that shows CS1612: sharplab.io/… – Untutored
I wouldn't say evil, but mutability is often a sign of overeagerness on the part of the programmer to provide a maximum of functionality. In reality, this is often not needed, and leaving it out makes the interface smaller, easier to use and harder to use wrong (= more robust).
One example of this is read/write and write/write conflicts in race conditions. These simply can't occur in immutable structures, since a write is not a valid operation.
Also, I claim that mutability is almost never actually needed, the programmer just thinks that it might be in the future. For example, it simply doesn't make sense to change a date. Rather, create a new date based off the old one. This is a cheap operation, so performance is not a consideration.
… float components of a graphics transform. If such a method returns an exposed-field struct with six components, it's obvious that modifying the fields of the struct won't modify the graphics object from which it was received. If such a method returns a mutable class object, maybe changing its properties will change the underlying graphics object and maybe it won't--nobody really knows. – Willardwillcox
… Rectangle property which could be my screen size, and I would like to change it outside of the class. – Artemisa
Mutable structs are not evil.
They are absolutely necessary in high performance circumstances. For example when cache lines and or garbage collection become a bottleneck.
I would not call the use of a mutable struct in these perfectly valid use cases "evil".
I can see the point that C#'s syntax does not help to distinguish access to a member of a value type from access to a member of a reference type, so I am all for preferring immutable structs, which enforce immutability, over mutable structs.
However, instead of simply labelling mutable structs as "evil", I would advise embracing the language and advocating more helpful and constructive rules of thumb.
For example: "Structs are value types, which are copied by default. You need a reference if you don't want to copy them" or "Try to work with readonly structs first".
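As a sketch of the "readonly structs first" rule of thumb: the `Size2D` type and its `WithWidth` method below are hypothetical, and the `readonly struct` modifier assumes C# 7.2 or later.

```csharp
using System;

// Hypothetical type; 'readonly struct' makes the compiler reject any mutation.
readonly struct Size2D
{
    public int Width { get; }
    public int Height { get; }

    public Size2D(int width, int height)
    {
        Width = width;
        Height = height;
    }

    // "Modification" returns a new value instead of changing this one.
    public Size2D WithWidth(int width) => new Size2D(width, Height);
}

static class ReadonlyDemo
{
    public static (int Old, int New) Run()
    {
        var a = new Size2D(10, 20);
        var b = a.WithWidth(22); // a is left untouched
        return (a.Width, b.Width);
    }

    static void Main()
    {
        var (oldWidth, newWidth) = Run();
        Console.WriteLine($"{oldWidth} {newWidth}"); // 10 22
    }
}
```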
… struct with public fields) than to define a class which can be used, clumsily, to achieve the same ends, or to add a bunch of junk to a struct to make it emulate such a class (rather than having it behave like a set of variables stuck together with duct tape, which is what one really wants in the first place) – Willardwillcox
Structs with public mutable fields or properties are not evil.
Struct methods (as distinct from property setters) which mutate "this" are somewhat evil, only because .net doesn't provide a means of distinguishing them from methods which do not. Struct methods that do not mutate "this" should be invokable even on read-only structs without any need for defensive copying. Methods which do mutate "this" should not be invokable at all on read-only structs. Since .net doesn't want to forbid struct methods that don't modify "this" from being invoked on read-only structs, but doesn't want to allow read-only structs to be mutated, it defensively copies structs in read-only contexts, arguably getting the worst of both worlds.
Despite the problems with the handling of self-mutating methods in read-only contexts, however, mutable structs often offer semantics far superior to mutable class types. Consider the following three method signatures:
struct PointyStruct { public int x, y, z; }
class PointyClass { public int x, y, z; }
void Method1(PointyStruct foo);
void Method2(ref PointyStruct foo);
void Method3(PointyClass foo);
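For each method, answer the following questions:
1. Assuming the method doesn't use any unsafe code, might it modify foo?
2. If no outside references to foo exist before the method is called, could an outside reference exist after it returns?
Answers:
Question 1:
Method1(): no (clear intent)
Method2(): yes (clear intent)
Method3(): yes (uncertain intent)
Question 2:
Method1(): no
Method2(): no (unless unsafe)
Method3(): yes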
Method1 can't modify foo, and never gets a reference. Method2 gets a short-lived reference to foo, which it can use to modify the fields of foo any number of times, in any order, until it returns, but it can't persist that reference. Before Method2 returns, unless it uses unsafe code, any and all copies that might have been made of its 'foo' reference will have disappeared. Method3, unlike Method2, gets a promiscuously-sharable reference to foo, and there's no telling what it might do with it. It might not change foo at all, it might change foo and then return, or it might give a reference to foo to another thread which might mutate it in some arbitrary way at some arbitrary future time. The only way to limit what Method3 might do to a mutable class object passed into it would be to encapsulate the mutable object into a read-only wrapper, which is ugly and cumbersome.
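The three cases can be sketched as below. The method bodies and the `kept` field are assumptions added for illustration; the answer gives only the signatures.

```csharp
using System;

struct PointyStruct { public int x, y, z; }
class PointyClass { public int x, y, z; }

static class SignatureDemo
{
    // Hypothetical storage showing that Method3 can persist its reference.
    static PointyClass kept;

    static void Method1(PointyStruct foo) { foo.x = 1; }            // mutates a private copy
    static void Method2(ref PointyStruct foo) { foo.x = 2; }        // mutates the caller's variable
    static void Method3(PointyClass foo) { foo.x = 3; kept = foo; } // mutates and keeps a reference

    public static (int AfterM1, int AfterM2, int AfterM3, int Later) Run()
    {
        var s = new PointyStruct();
        Method1(s);
        int afterM1 = s.x;  // still 0: only the copy changed
        Method2(ref s);
        int afterM2 = s.x;  // 2: changed through the ref parameter

        var c = new PointyClass();
        Method3(c);
        int afterM3 = c.x;  // 3
        kept.x = 99;        // a later change through the persisted reference
        return (afterM1, afterM2, afterM3, c.x);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // (0, 2, 3, 99)
    }
}
```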
Arrays of structures offer wonderful semantics. Given RectArray[500] of type Rectangle, it's clear and obvious how to e.g. copy element 123 to element 456 and then some time later set the width of element 123 to 555, without disturbing element 456. "RectArray[456] = RectArray[123]; ...; RectArray[123].Width = 555;". Knowing that Rectangle is a struct with an integer field called Width will tell one all one needs to know about the above statements.
Now suppose RectClass was a class with the same fields as Rectangle and one wanted to do the same operations on a RectClassArray[500] of type RectClass. Perhaps the array is supposed to hold 500 pre-initialized immutable references to mutable RectClass objects. In that case, the proper code would be something like "RectClassArray[456].SetBounds(RectClassArray[123]); ...; RectClassArray[123].Width = 555;". Perhaps the array is assumed to hold instances that aren't going to change, so the proper code would be more like "RectClassArray[456] = RectClassArray[123]; ...; RectClassArray[123] = new RectClass(RectClassArray[123]); RectClassArray[123].Width = 555;". To know what one is supposed to do, one would have to know a lot more both about RectClass (e.g. does it support a copy constructor, a copy-from method, etc.) and the intended usage of the array. Nowhere near as clean as using a struct.
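A small sketch of the contrast, using hypothetical `RectStruct` and `RectClass` types with the same fields. With the class array, the naive element assignment copies a reference, so the later width change shows up in both slots.

```csharp
using System;

// Hypothetical types with identical fields, differing only in struct vs. class.
struct RectStruct { public int X, Y, Width, Height; }
class RectClass { public int X, Y, Width, Height; }

static class ArrayDemo
{
    public static (int StructCopy, int ClassCopy) Run()
    {
        var structs = new RectStruct[500];
        structs[123].Width = 100;
        structs[456] = structs[123]; // copies the value
        structs[123].Width = 555;    // element 456 is not disturbed

        var classes = new RectClass[500];
        classes[123] = new RectClass { Width = 100 };
        classes[456] = classes[123]; // copies the reference only
        classes[123].Width = 555;    // visible through element 456 as well

        return (structs[456].Width, classes[456].Width);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // (100, 555)
    }
}
```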
To be sure, there is unfortunately no nice way for any container class other than an array to offer the clean semantics of a struct array. The best one could do, if one wanted a collection to be indexed with e.g. a string, would probably be to offer a generic "ActOnItem" method which would accept a string for the index, a generic parameter, and a delegate which would be passed by reference both the generic parameter and the collection item. That would allow nearly the same semantics as struct arrays, but unless the vb.net and C# people can be persuaded to offer a nice syntax, the code is going to be clunky-looking even if it is reasonably performant (passing a generic parameter would allow for use of a static delegate and would avoid any need to create any temporary class instances).
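One possible shape for such an "ActOnItem" method is sketched below. Only the method name comes from the answer; `RefDictionary`, `ActByRef` and `Rect` are hypothetical, and the backing-array layout is just one way to make items addressable by reference.

```csharp
using System;
using System.Collections.Generic;

// Delegate that receives the stored item and an extra parameter, both by reference.
delegate void ActByRef<TItem, TParam>(ref TItem item, ref TParam param);

// Hypothetical string-indexed collection; items live in an array so they can be passed by ref.
class RefDictionary<TItem> where TItem : struct
{
    private readonly Dictionary<string, int> slots = new Dictionary<string, int>();
    private TItem[] items = new TItem[4];
    private int count;

    public void Add(string key, TItem item)
    {
        if (count == items.Length) Array.Resize(ref items, count * 2);
        slots.Add(key, count);   // throws on duplicate keys before anything is stored
        items[count++] = item;
    }

    // Runs the action directly on the stored item, with no copy in between.
    public void ActOnItem<TParam>(string key, ref TParam param, ActByRef<TItem, TParam> action)
    {
        action(ref items[slots[key]], ref param);
    }

    public TItem Get(string key) => items[slots[key]]; // returns a copy
}

struct Rect { public int X, Y, Width, Height; }

static class ActOnItemDemo
{
    public static int Run()
    {
        var rects = new RefDictionary<Rect>();
        rects.Add("main", new Rect { Width = 100 });
        int newWidth = 555;
        // The lambda captures nothing; the new width travels in the ref parameter instead.
        rects.ActOnItem<int>("main", ref newWidth, (ref Rect r, ref int w) => r.Width = w);
        return rects.Get("main").Width;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 555
    }
}
```

The call site is indeed clunkier than `RectArray[123].Width = 555;`, which is the point the answer makes.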
Personally, I'm peeved at the hatred Eric Lippert et al. spew regarding mutable value types. They offer much cleaner semantics than the promiscuous reference types that are used all over the place. Despite some of the limitations with .net's support for value types, there are many cases where mutable value types are a better fit than any other kind of entity.
… Rectangle I could easily come up with a common situation where you get highly unclear behaviour. Consider that WinForms implements a mutable Rectangle type used in the form's Bounds property. If I want to change bounds I would want to use your nice syntax: form.Bounds.X = 10; However this changes precisely nothing on the form (and generates a lovely error informing you of such). Inconsistency is the bane of programming and is why immutability is wanted. – Inseparable
… Rectangle with a RectangleF with slightly fewer changes is hardly a common situation and moreso still requires changing the declared type at any declaration sites. I think it's a stretch to claim that being able to slightly modify a type easier is worth the loss of clarity clearly shown in the answers here. – Inseparable
There are a couple of other corner cases that could lead to unpredictable behavior from the programmer's point of view.
using System;

// Simple mutable structure.
// The IncrementI method mutates the current state.
struct Mutable
{
    public Mutable(int i) : this()
    {
        I = i;
    }
    public void IncrementI() { I++; }
    public int I { get; private set; }
}
// Simple class that contains the Mutable structure
// as a readonly field
class SomeClass
{
    public readonly Mutable mutable = new Mutable(5);
}
// Simple class that contains the Mutable structure
// as an ordinary (non-readonly) field
class AnotherClass
{
    public Mutable mutable = new Mutable(5);
}
class Program
{
    static void Main()
    {
        // Case 1. Mutable readonly field
        var someClass = new SomeClass();
        someClass.mutable.IncrementI();
        // Still 5, not 6, because the SomeClass.mutable field is readonly
        // and the compiler creates a temporary copy every time you
        // call a method on this field
        Console.WriteLine(someClass.mutable.I);
        // Case 2. Mutable ordinary field
        var anotherClass = new AnotherClass();
        anotherClass.mutable.IncrementI();
        // Prints 6, because the AnotherClass.mutable field is not readonly
        Console.WriteLine(anotherClass.mutable.I);
    }
}
Suppose we have an array of our Mutable struct and we're calling the IncrementI method for the first element of that array. What behavior are you expecting from this call? Should it change the array's value or only a copy?
Mutable[] arrayOfMutables = new Mutable[1];
arrayOfMutables[0] = new Mutable(5);
// Now we are actually accessing a reference to the first element
// without making any additional copy
arrayOfMutables[0].IncrementI();
// Prints 6!!
Console.WriteLine(arrayOfMutables[0].I);
// Every array implements the IList<T> interface
IList<Mutable> listOfMutables = arrayOfMutables;
// But accessing values through this interface leads
// to different behavior: the IList indexer returns a copy
// instead of a managed reference
listOfMutables[0].IncrementI(); // Should change I to 7
// Nope! We still have 6, because the previous line of code
// mutated a copy instead of the list value
Console.WriteLine(listOfMutables[0].I);
So, mutable structs are not evil as long as you and the rest of the team clearly understand what you are doing. But there are too many corner cases where the program's behavior differs from what's expected, and these can lead to subtle, hard-to-reproduce and hard-to-understand errors.
… T[] and an integer index, and providing that an access to a property of type ArrayRef<T> will be interpreted as an access to the appropriate array element) [if a class wanted to expose an ArrayRef<T> for any other purpose, it could provide a method--as opposed to a property--to retrieve it]. Unfortunately, no such provisions exist. – Willardwillcox
… public static void IncrementI(ref Mutable m) { m.I++; } then the compiler should stop you from doing the "wrong" things most of the time. – Kinslow
Value types basically represent immutable concepts. For example, it makes no sense to have a mathematical value such as an integer, vector etc. and then be able to modify it. That would be like redefining the meaning of a value. Instead of changing a value type, it makes more sense to assign another unique value. Think about the fact that value types are compared by comparing all the values of their properties. The point is that if the properties are the same then it is the same universal representation of that value.
As Konrad mentions it doesn't make sense to change a date either, as the value represents that unique point in time and not an instance of a time object which has any state or context-dependency.
Hope this makes sense to you. It is more about the concept you try to capture with value types than practical details, to be sure.
… int iterator, which would be completely useless if it were immutable. I think you're conflating “value types' compiler/runtime implementations” with “variables typed to a value type”; the latter is certainly mutable to any of the possible values. – Encounter
If you have ever programmed in a language like C/C++, structs are fine to use as mutable. Just pass them around with ref and there is nothing that can go wrong. The only problem I find is the restrictions of the C# compiler and that, in some cases, I am unable to force the stupid thing to use a reference to the struct instead of a copy (like when a struct is part of a C# class).
So, mutable structs are not evil; C# has made them evil. I use mutable structs in C++ all the time and they are very convenient and intuitive. In contrast, C# has made me completely abandon structs as members of classes because of the way they handle objects. Their convenience has cost us ours.
… readonly, but if one avoids doing those things class fields of structure types are just fine. The only really fundamental limitation of structures is that a struct field of a mutable class type like int[] may encapsulate identity or an unchanging set of values, but cannot be used to encapsulate mutable values without also encapsulating an unwanted identity. – Willardwillcox
Imagine you have an array of 1,000,000 structs, each representing an equity with stuff like bid_price, offer_price (perhaps decimals) and so on; this is created by C#/VB.
Imagine that array is created in a block of memory allocated in the unmanaged heap so that some other native code thread is able to concurrently access the array (perhaps some high-perf code doing math).
Imagine the C#/VB code is listening to a market feed of price changes, that code may have to access some element of the array (for whichever security) and then modify some price field(s).
Imagine this is being done tens or even hundreds of thousands of times per second.
Well, let's face facts: in this case we really do want these structs to be mutable. They need to be because they are being shared by some other native code, so creating copies isn't gonna help; they need to be because making a copy of some 120 byte struct at these rates is lunacy, especially when an update may actually impact just a byte or two.
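A rough sketch of the in-place update, under simplifying assumptions: it uses an ordinary managed array rather than the unmanaged block the answer describes, a much smaller element count, and hypothetical names (`Quote`, `FeedDemo`, `OnPriceChange`). The ref local requires C# 7.0 or later.

```csharp
using System;

// Hypothetical record for one security; the field names are illustrative.
struct Quote
{
    public decimal Bid;
    public decimal Offer;
    public long Volume;
}

static class FeedDemo
{
    // Kept small here; the answer talks about 1,000,000 elements in unmanaged memory.
    static readonly Quote[] quotes = new Quote[1_000];

    // Updates one field in place without copying the whole struct.
    public static void OnPriceChange(int index, decimal bid)
    {
        ref Quote q = ref quotes[index]; // alias to the array element, not a copy
        q.Bid = bid;
    }

    public static decimal Run()
    {
        OnPriceChange(42, 101.25m);
        return quotes[42].Bid;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 101.25
    }
}
```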
Hugo
If you stick to what structs are intended for (in C#, Visual Basic 6, Pascal/Delphi, C++ struct type (or classes) when they are not used as pointers), you will find that a structure is not more than a compound variable. This means: you will treat them as a packed set of variables, under a common name (a record variable you reference members from).
I know that would confuse a lot of people deeply used to OOP, but that's not enough reason to say such things are inherently evil, if used correctly. Some structures are immutable by design (this is the case of Python's namedtuple), but it is another paradigm to consider.
Yes: structs involve a lot of memory, but it will not be precisely more memory by doing:
point.x = point.x + 1
compared to:
point = Point(point.x + 1, point.y)
The memory consumption will be at least the same, or even more in the immutable case (although that case would be temporary, for the current stack, depending on the language).
But, finally, structures are structures, not objects. In OOP, the main property of an object is its identity, which most of the time is nothing more than its memory address. Struct stands for data structure (not a proper object, and so it has no identity), and data can be modified. In other languages, record (instead of struct, as is the case for Pascal) is the word and serves the same purpose: just a data record variable, intended to be read from files, modified, and dumped into files (that is the main use and, in many languages, you can even define data alignment in the record, while that's not necessarily the case for properly called Objects).
Want a good example? Structs are used to read files easily. Python has this library because, since it is object-oriented and has no support for structs, it had to implement it in another way, which is somewhat ugly. Languages implementing structs have that feature... built-in. Try reading a bitmap header with an appropriate struct in languages like Pascal or C. It will be easy (if the struct is properly built and aligned; in Pascal you would not use a record-based access but functions to read arbitrary binary data). So, for files and direct (local) memory access, structs are better than objects. As for today, we're used to JSON and XML, and so we forget the use of binary files (and as a side effect, the use of structs). But yes: they exist, and have a purpose.
They are not evil. Just use them for the right purpose.
If you think in terms of hammers, you will want to treat screws as nails, to find screws are harder to plunge in the wall, and it will be screws' fault, and they will be the evil ones.
When something can be mutated, it gains a sense of identity.
struct Person {
    public string name; // mutable
    public Point position = new Point(0, 0); // mutable
    public Person(string name, Point position) { ... }
}
Person eric = new Person("Eric Lippert", new Point(4, 2));
Because Person is mutable, it's more natural to think about changing Eric's position than cloning Eric, moving the clone, and destroying the original. Both operations would succeed in changing the contents of eric.position, but one is more intuitive than the other. Likewise, it's more intuitive to pass Eric around (as a reference) for methods to modify him. Giving a method a clone of Eric is almost always going to be surprising. Anyone wanting to mutate Person must remember to ask for a reference to Person or they'll be doing the wrong thing.
If you make the type immutable, the problem goes away; if I can't modify eric, it makes no difference to me whether I receive eric or a clone of eric. More generally, a type is safe to pass by value if all of its observable state is held in members that are either:
If those conditions are met then a mutable value type behaves like a reference type because a shallow copy will still allow the receiver to modify the original data.
The intuitiveness of an immutable Person depends on what you're trying to do though. If Person just represents a set of data about a person, there's nothing unintuitive about it; Person variables truly represent abstract values, not objects. (In that case, it'd probably be more appropriate to rename it to PersonData.) If Person is actually modeling a person itself, the idea of constantly creating and moving clones is silly even if you've avoided the pitfall of thinking you're modifying the original. In that case it'd probably be more natural to simply make Person a reference type (that is, a class.)
Granted, as functional programming has taught us there are benefits to making everything immutable (no one can secretly hold on to a reference to eric and mutate him), but since that's not idiomatic in OOP it's still going to be unintuitive to anyone else working with your code.
… foo holds the only reference to its target anywhere in the universe, and nothing has captured that object's identity-hash value, then mutating field foo.X is semantically equivalent to making foo point to a new object which is just like the one it previously referred to, but with X holding the desired value. With class types, it's generally hard to know whether multiple references exist to something, but with structs it's easy: they don't. – Willardwillcox
… Thing is a mutable class type, a Thing[] will encapsulate object identities--whether one wants it to or not--unless one can ensure that no Thing in the array to which any outside references exist will ever be mutated. If one doesn't want the array elements to encapsulate identity, one must generally ensure either that no items to which it holds references will ever be mutated, or that no outside references will ever exist to any items it holds [hybrid approaches can also work]. Neither approach is terribly convenient. If Thing is a structure, a Thing[] encapsulates values only. – Willardwillcox
It doesn’t have anything to do with structs (and not with C#, either) but in Java you might get problems with mutable objects when they are e.g. keys in a hash map. If you change them after adding them to a map and it changes its hash code, evil things might happen.
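The answer talks about Java, but the same hazard can be sketched in C# with Dictionary<TKey, TValue> and a mutable class key. The `Key` type below is hypothetical and deliberately derives its hash code from a mutable field.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key whose hash code depends on mutable state.
class Key
{
    public int Id;
    public override bool Equals(object obj) => obj is Key k && k.Id == Id;
    public override int GetHashCode() => Id;
}

static class HashKeyDemo
{
    public static (bool Before, bool After) Run()
    {
        var map = new Dictionary<Key, string>();
        var key = new Key { Id = 1 };
        map[key] = "value";

        bool before = map.ContainsKey(key); // found: the hash code matches the stored one
        key.Id = 2;                         // mutating the key changes its hash code
        bool after = map.ContainsKey(key);  // the entry still exists but can no longer be found
        return (before, after);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // (True, False)
    }
}
```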
There are many advantages and disadvantages to mutable data. The million-dollar disadvantage is aliasing. If the same value is being used in multiple places, and one of them changes it, then it will appear to have magically changed to the other places that are using it. This is related to, but not identical with, race conditions.
The million-dollar advantage is modularity, sometimes. Mutable state can allow you to hide changing information from code that doesn't need to know about it.
The Art of the Interpreter goes into these trade offs in some detail, and gives some examples.
Personally when I look at code the following looks pretty clunky to me:
data.value.set ( data.value.get () + 1 ) ;
rather than simply
data.value++ ; or data.value = data.value + 1 ;
Data encapsulation is useful when passing a class around and you want to ensure the value is modified in a controlled fashion. However when you have public set and get functions that do little more than set the value to what ever is passed in, how is this an improvement over simply passing a public data structure around?
When I create a private structure inside a class, I created that structure to organize a set of variables into one group. I want to be able to modify that structure within the class scope, not get copies of that structure and create new instances.
To me this prevents a valid use of structures being used to organize public variables, if I wanted access control I'd use a class.
There are several issues with Mr. Eric Lippert's example. It is contrived to illustrate the point that structs are copied and how that could be a problem if you are not careful. Looking at the example I see it as a result of a bad programming habit and not really a problem with either struct or the class.
A struct is supposed to have only public members and should not require any encapsulation. If it does then it really should be a type/class. You really do not need two constructs to say the same thing.
If you have class enclosing a struct, you would call a method in the class to mutate the member struct. This is what I would do as a good programming habit.
A proper implementation would be as follows.
struct Mutable {
    public int x;
}
class Test {
    private Mutable m = new Mutable();
    // Mutates the member struct in place and returns the new value.
    public int mutate()
    {
        m.x = m.x + 1;
        return m.x;
    }
}
class Program {
    static void Main(string[] args) {
        Test t = new Test();
        System.Console.WriteLine(t.mutate());
        System.Console.WriteLine(t.mutate());
        System.Console.WriteLine(t.mutate());
    }
}
It looks like it is an issue with programming habit as opposed to an issue with struct itself. Structs are supposed to be mutable, that is the idea and intent.
The result of the changes, voilà, behaves as expected:
1
2
3
Press any key to continue . . .
I don't believe they're evil if used correctly. I wouldn't put it in my production code, but I would for something like structured unit testing mocks, where the lifespan of a struct is relatively small.
Using the Eric example, perhaps you want to create a second instance of that Eric, but make adjustments, as that's the nature of your test (ie duplication, then modifying). It doesn't matter what happens with the first instance of Eric if we're just using Eric2 for the remainder of the test script, unless you're planning on using him as a test comparison.
This would be mostly useful for testing or modifying legacy code that shallowly defines a particular object (the point of structs), but an immutable struct annoyingly prevents this usage.
… Range<T> type with members Minimum and Maximum fields of type T, and code Range<double> myRange = foo.GetRange();, any guarantees about what Minimum and Maximum contain should come from foo.GetRange();. Having Range be an exposed-field struct would make clear that it's not going to add any behavior of its own. –
Willardwillcox
… ints, bools, and all other value types are evil. There are cases for mutability and for immutability. Those cases hinge on the role the data plays, not the type of memory allocation/sharing. – Encounter
… int and bool are not mutable.. – Botvinnik
… .-syntax, making operations with ref-typed data and value-typed data look the same even though they're distinctly different. This is a fault of C#'s properties, not structs; some languages offer an alternate a[V][X] = 3.14 syntax for mutating in-place. In C#, you'd do better to offer struct-member mutator methods like MutateV(Action<ref Vector2> mutator) and use it like a.MutateV((v) => { v.X = 3; }) (example is over-simplified because of the limitations C# has regarding the ref keyword, but with some workarounds should be possible). – Encounter
… ref keyword in a generic type isn't ambiguous; it's just not supported. That's a limitation. And yes, C# in many ways forces inconvenient syntax to get the job done. The most classic example of C#'s clumsiness is public over and over and over on every member rather than a C++-style public: prefix or a Ruby-style separate public FieldName1, FieldName2, MethodName1, MethodName2; – Encounter
… x) this single operation is 4 instructions: ldloc.0 (loads the 0-index variable into... – Deerhound
… T is a type. Ref is just a keyword that makes the variable itself, not a copy of it, be passed to a method. It also makes sense for reference types, since we can change the variable, i.e. the reference outside the method will point to another object after being changed within the method. Since ref T is not a type, but a fashion of passing a method parameter, you cannot put it into <>, because only types can be put there. So it's just incorrect. Maybe it would be convenient to do so, maybe the C# team could make this for some new version, but right now they're working on some... – Deerhound
… public... I actually like the C# way of doing this. It's actually easier to refactor, because deleting/adding one public in one place for one member does not affect all other members. In C++ I often have to add two labels (e.g. public: and private: again), or move the method to another part of the class. Inconvenient. Anyway, mentioning this topic made me realise that we should end that discussion. We are starting to argue about our own opinions, and we could battle for years and not find a common.. – Deerhound