C# Automatic deep copy of struct

I have a struct, MyStruct, that has a private member private bool[] boolArray; and a method ChangeBoolValue(int index, bool Value).

I have a class, MyClass, that has a property public MyStruct bools { get; private set; }

When I create a new MyStruct object from an existing one, and then apply method ChangeBoolValue(), the bool array in both objects is changed, because the reference, not what was referred to, was copied to the new object. E.g:

MyStruct A = new MyStruct();
MyStruct B = A;  //Copy of A made
B.ChangeBoolValue(0,true);
//Now A.BoolArr[0] == B.BoolArr[0] == true

Is there a way of forcing a copy to implement a deeper copy, or is there a way to implement this that will not have the same issue?

I had specifically made MyStruct a struct because it was value type, and I did not want references propagating.

Catamenia answered 5/7, 2012 at 1:35 Comment(2)
Might be too late, and I don't know the details of your design or requirements, but generally speaking, mutable structs are evil. #946164 - Gad
Thanks, the answer in there essentially describes the problem I was having :P - Catamenia
G
14

The runtime performs a fast memory copy of structs and as far as I know, it's not possible to introduce or force your own copying procedure for them. You could introduce your own Clone method or even a copy-constructor, but you could not enforce that they use them.
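As a rough sketch of the opt-in approach mentioned above, a copy constructor and a Clone method might look like the following. The length constructor and GetBoolValue are hypothetical additions made only so the example is self-contained; the question's struct does not show them.

```csharp
using System;

public struct MyStruct
{
    private bool[] boolArray;

    // Hypothetical constructor, added so the sketch has an array to work with.
    public MyStruct(int length) { boolArray = new bool[length]; }

    // Copy constructor: duplicates the array instead of sharing the reference.
    public MyStruct(MyStruct other) { boolArray = (bool[])other.boolArray.Clone(); }

    // Explicit deep copy. Callers must remember to use it.
    public MyStruct Clone() => new MyStruct(this);

    // Hypothetical accessor, added for illustration.
    public bool GetBoolValue(int index) => boolArray[index];

    public void ChangeBoolValue(int index, bool value) { boolArray[index] = value; }
}
```

With this, MyStruct b = a.Clone(); gets its own array, while a plain MyStruct c = a; still shares the array with a, which is exactly the part that cannot be enforced.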

Your best bet, if possible, is to make your struct immutable (or an immutable class), or to redesign in general to avoid this issue. If you are the sole consumer of the API, then perhaps you can just remain extra vigilant.

Jon Skeet (and others) have described this issue and although there can be exceptions, generally speaking: mutable structs are evil. See: "Can structs contain fields of reference types".
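A minimal sketch of what an immutable version could look like, assuming C# 7.2 or later for readonly struct. The names ImmutableBools, Create and With are made up for illustration. The array is never modified after it is stored, so sharing it between copies is harmless.

```csharp
using System;

public readonly struct ImmutableBools
{
    // Never modified after construction, so copies can safely share it.
    private readonly bool[] values;

    private ImmutableBools(bool[] values) => this.values = values;

    public static ImmutableBools Create(int length) => new ImmutableBools(new bool[length]);

    public bool this[int index] => values[index];

    // Returns a new value with one element changed; the original is untouched.
    public ImmutableBools With(int index, bool value)
    {
        var copy = (bool[])values.Clone();
        copy[index] = value;
        return new ImmutableBools(copy);
    }
}
```

Usage would be along the lines of var b = a.With(0, true); after which a still reads as before and only b sees the change. Note that a default-initialized instance has a null array, so a real type would need to guard against that.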

Gad answered 5/7, 2012 at 2:9 Comment(0)
N
3

One simple method to make a (deep) copy, though not the fastest one (because it uses reflection), is to use BinaryFormatter to serialize the original object to a MemoryStream and then deserialize from that MemoryStream to a new MyStruct.

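    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    // Requires T (and everything it references) to be marked [Serializable].
    public static T DeepCopy<T>(T obj)
    {
        var formatter = new BinaryFormatter();
        using (var ms = new MemoryStream())
        {
            formatter.Serialize(ms, obj);
            ms.Position = 0; // rewind before reading the data back
            return (T)formatter.Deserialize(ms);
        }
    }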

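Works for classes and structs, as long as the type is marked [Serializable]. Note that BinaryFormatter is marked obsolete in .NET 5 and later.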

Neuman answered 5/7, 2012 at 1:39 Comment(9)
I'm not sure this necessarily answers the question. This requires that consumers of MyStruct call the DeepCopy method when passing around instances. I think 3Pi is asking how to do the deep copy automatically whenever the runtime copies the struct. - Gad
@ChrisSinclair: Since assignment creates an alias (in his code object.ReferenceEquals(A, B) is true), simply using the assignment operator will not achieve what he's looking for. I'm not sure there is another way short of using a (extension) method. - Neuman
@Chris, Yes, that is what I meant - was hoping to not have to change anything except the struct itself. - Catamenia
@Eric, structs are value types. Doesn't that mean that when you assign one to another, it is copied? If not, what does it mean for a struct to be a value type? - Catamenia
@Catamenia I'm not sure that's possible. The runtime does fast memory copying of structs and (as far as I know) there's no method to implement/override this behaviour. @EricJ Yeah, you're spot on about that. Though again, having an extension method will just be a convenience for the consumer. If you have control over usage of MyStruct, 3Pi, then that should be fine; you just need to be vigilant about its use. - Gad
@Catamenia it depends on the context. Passing a struct through a method (parameter or return value) or a property setter/getter will make a copy. Making a local reference within a method will not make a new copy. - Gad
@Chris: Structs are value types. The assignment operator will create a shallow copy of it, not a reference to it. That would make it a reference type. If you have int A = 2; int B = A;, B is not a reference to A, it is a copy of the information. Structs behave the same way. The problem in my case was that the information contained in the struct was a reference, and so the reference was being copied. - Catamenia
Ahh yeah, you're right. I don't know where my mind is at tonight *cracks open another beer*. Even in a local context, it's not an alias, it's still a copy. - Gad
@ChrisSinclair not only that, but if you have an int variable x (or indeed any value type) then object.ReferenceEquals(x, x) returns false, because it checks for reference equality of two separate boxed instances of the same value. - Sepulveda
C
2

As a workaround, I am going to implement the following.

There are 2 methods in the struct that can modify the contents of BoolArray. Rather than copying the array when the struct is copied, BoolArray will be created anew whenever a call to change it is made, as follows:

public void ChangeBoolValue(int index, bool value)
{
    // Copy-on-write: swap in a private copy of the array before modifying it,
    // so other structs holding the old reference are unaffected.
    bool[] copy = new bool[BoolArray.Length];
    BoolArray.CopyTo(copy, 0);
    BoolArray = copy;

    BoolArray[index] = value;
}

Though this would be bad for any uses that involved much change of the BoolArray, my use of the struct is a lot of copying, and very little changing. This will only change the reference to the array when a change is required.

Catamenia answered 5/7, 2012 at 1:58 Comment(2)
Why don't you use a persistent immutable collection type? Eric Lippert has a good series on these in his blog. - Sepulveda
I have actually gone for the immutable struct, as suggested by Chris. It required the least external change, and little internal change. This was an idea that occurred to me, but would have been smelly code. - Catamenia
G
2

A struct is copied when passed, right? So:

public static class StructExts
{
    public static T Clone<T> ( this T val ) where T : struct => val;
}

Usage:

var clone = new AnyStruct ().Clone ();
Glans answered 31/1, 2020 at 16:12 Comment(1)
Well, when a struct is passed as a parameter, it's copied, so I just return the copy. Note that if the struct contains classes, they aren't duplicated, only their references. Read more about shallow and deep copy. - Glans
E
1

To avoid weird semantics, any struct which holds a field of a mutable reference type must do one of two things:

  1. It should make very clear that, from its perspective, the content of the field serves not to "hold" an object, but merely to identify one. For example, a `KeyValuePair<String, Control>` would be a perfectly reasonable type, since although `Control` is mutable, the identity of a control referenced by such a type would be immutable.
  2. The mutable object must be one which is created by the value type and will never be exposed outside it. Further, any mutations that will ever be performed upon the object must be performed before a reference to the object is stored into any field of the struct.

As others have noted, one way to allow a struct to simulate an array would be for it to hold an array, and make a new copy of that array any time an element is modified. Such a thing would, of course, be outrageously slow. An alternative approach would be to add some logic to store the indices and values of the last few mutation requests; any time an attempt is made to read the array, check whether the element is one of the recently written ones and, if so, use the value stored in the struct instead of the one in the array. Once all of the 'slots' within the struct are filled up, make a copy of the array. This approach would at best "only" offer a constant speed-up versus regenerating the array if updates hit many different elements, but could be helpful if the vast majority of updates hit a small number of elements.
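To make the "recent mutations" idea concrete, here is a rough sketch with two inline slots. The type name BufferedBoolArray and the choice of two slots are assumptions made for illustration. The array is never written to after it has been stored in the struct, so copies of the struct can share it safely, while the pending writes are copied by value along with the struct.

```csharp
using System;

public struct BufferedBoolArray
{
    private bool[] baseArray;    // never mutated once stored in the struct
    private int index0, index1;  // indices of the pending writes
    private bool value0, value1; // values of the pending writes
    private int pending;         // number of slots in use (0 to 2)

    public BufferedBoolArray(int length)
    {
        baseArray = new bool[length];
        index0 = index1 = 0;
        value0 = value1 = false;
        pending = 0;
    }

    public bool this[int index]
    {
        get
        {
            if ((uint)index >= (uint)baseArray.Length) throw new IndexOutOfRangeException();
            // Pending writes take priority over the shared array.
            if (pending > 0 && index0 == index) return value0;
            if (pending > 1 && index1 == index) return value1;
            return baseArray[index];
        }
        set
        {
            if ((uint)index >= (uint)baseArray.Length) throw new IndexOutOfRangeException();
            // Overwrite an existing slot for the same index, if any.
            if (pending > 0 && index0 == index) { value0 = value; return; }
            if (pending > 1 && index1 == index) { value1 = value; return; }
            // Otherwise use a free slot.
            if (pending == 0) { index0 = index; value0 = value; pending = 1; return; }
            if (pending == 1) { index1 = index; value1 = value; pending = 2; return; }
            // Both slots are full: fold everything into a fresh private array.
            bool[] copy = (bool[])baseArray.Clone();
            copy[index0] = value0;
            copy[index1] = value1;
            copy[index] = value;
            baseArray = copy;
            pending = 0;
        }
    }
}
```

Only every third write to a distinct element pays for an array copy here. A default-initialized instance has a null array and would throw on first use, which a real implementation would need to handle.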

Another approach, when updates are likely to have a high spatial concentration but hit too many elements to fit entirely within a struct, would be to keep a reference to a "main" array, as well as an "updates" array along with an integer indicating what part of the main array the "updates" array represents. Updates would often require regeneration of the "updates" array, but that could be much smaller than the main array; if the "updates" array gets too big, the main array can be regenerated with the changes represented by the "updates" array incorporated within it.

The biggest problem with any of these approaches is that while the struct could be engineered in such a way as to present consistent value-type semantics while allowing efficient copying, a glance at the struct's code would hardly make that obvious (as compared with plain-old-data structs, where the fact that the struct has a public field called Foo makes it very clear how Foo will behave).

Experienced answered 5/7, 2012 at 20:31 Comment(0)
F
0

I was thinking about a similar issue related to value types, and found a "solution" to this. You see, you cannot change the default copy constructor in C# like you can in C++, because it's intended to be lightweight and free of side effects. However, what you can do is wait until you actually access the struct, and then check if it was copied.

The problem with this is that unlike reference types, structs have no real identity; there is only by-value equality. However, they still have to be stored at some place in memory, and this address can be used to identify (albeit temporarily) a value type. The GC is a concern here, because it can move objects around, and therefore change the address at which the struct is located, so you would have to be able to cope with that (e.g. make the struct's data private).

In practice, the address of the struct can be obtained from the this reference, because it's a simple ref T in the case of a value type. I leave the means to obtain the address from a reference to my library, but it's quite simple to emit custom CIL for that. In this example, I create something that is essentially a by-value array.

public struct ByValArray<T>
{
    //Backup field for cloning from.
    T[] array;

    public ByValArray(int size)
    {
        array = new T[size];
        //Updating the instance is really not necessary until we access it.
    }

    private void Update()
    {
        //This should be called from any public method on this struct.
        T[] inst = FindInstance(ref this);
        if(inst != array)
        {
            //A new array was cloned for this address.
            array = inst;
        }
    }

    //I suppose a GCHandle would be better than WeakReference,
    //but this is sufficient for illustration.
    static readonly Dictionary<IntPtr, WeakReference<T[]>> Cache = new Dictionary<IntPtr, WeakReference<T[]>>();

    static T[] FindInstance(ref ByValArray<T> arr)
    {
        T[] orig = arr.array;
        return UnsafeTools.GetPointer(
            //Obtain the address from the reference.
            //It uses a lambda to minimize the chance of the reference
            //being moved around by the GC.
            out arr,
            ptr => {
                WeakReference<T[]> wref;
                T[] inst;
                if(Cache.TryGetValue(ptr, out wref) && wref.TryGetTarget(out inst))
                {
                    //An object is found on this address.
                    if(inst != orig)
                    {
                        //This address was overwritten with a new value,
                        //clone the instance.
                        inst = (T[])orig.Clone();
                        Cache[ptr] = new WeakReference<T[]>(inst);
                    }
                    return inst;
                }else{
                    //No object was found on this address,
                    //clone the instance.
                    inst = (T[])orig.Clone();
                    Cache[ptr] = new WeakReference<T[]>(inst);
                    return inst;
                }
            }
        );
    }

    //All subsequent methods should always update the state first.
    public T this[int index]
    {
        get{
            Update();
            return array[index];
        }
        set{
            Update();
            array[index] = value;
        }
    }

    public int Length{
        get{
            Update();
            return array.Length;
        }
    }

    public override bool Equals(object obj)
    {
        Update();
        return base.Equals(obj);
    }

    public override int GetHashCode()
    {
        Update();
        return base.GetHashCode();
    }

    public override string ToString()
    {
        Update();
        return base.ToString();
    }
}

var a = new ByValArray<int>(10);
a[5] = 11;
Console.WriteLine(a[5]); //11

var b = a;
b[5]++;
Console.WriteLine(b[5]); //12
Console.WriteLine(a[5]); //11

var c = a;
a = b;
Console.WriteLine(a[5]); //12
Console.WriteLine(c[5]); //11

As you can see, this value type behaves exactly as if the underlying array was copied to a new location every time the reference to the array is copied.

WARNING!!! Use this code only at your own risk, and preferably never in production code. This technique is wrong and evil on so many levels, because it assumes identity for something that shouldn't have it. Although this tries to "enforce" value type semantics for this struct ("the end justifies the means"), there are certainly better solutions to the real problem in almost any case. Also please note that although I have tried to foresee any foreseeable issues with this, there could be cases where this type will show quite unexpected behaviour.

Flew answered 8/8, 2017 at 23:29 Comment(0)
R
0
    unsafe struct MyStruct { // add unsafe
        private fixed bool boolArray[100]; // fixed-size buffer, stored inline in the struct
        public void ChangeBoolValue(int index, bool Value) {
            boolArray[index] = Value;
        }
    }

    MyStruct copyMyStruct(MyStruct copy){
        return copy;
    }

Project->Properties->Build->Allow unsafe code = checked

Project->Properties->Build->Optimize code = unchecked

    MyStruct A = new MyStruct();
    MyStruct B = copyMyStruct(A); // Copy of A made
    B.ChangeBoolValue(0, true);   // B.boolArray[0] == true, A.boolArray[0] == false
Randirandie answered 10/4, 2022 at 2:12 Comment(0)
