Does swift copy on write for all structs?

I know that Swift optimizes arrays to copy on write, but will it do this for all structs? For example:

struct Point {
   var x:Float = 0
}

var p1 = Point()
var p2 = p1 //p1 and p2 share the same data under the hood
p2.x += 1 //p2 now has its own copy of the data
Chickweed answered 19/4, 2017 at 4:29 Comment(1)
Nitpick: This behaviour is a property of the Swift compiler, not of the Swift language. So long as the program behaviour is in line with the language specification, the compiler is free to do what it sees fit. – Paludal

Array is implemented with copy-on-write behaviour – you'll get it regardless of any compiler optimisations (although of course, optimisations can decrease the number of cases where a copy needs to happen).

At a basic level, Array is just a structure that holds a reference to a heap-allocated buffer containing the elements – therefore multiple Array instances can reference the same buffer. When you come to mutate a given array instance, the implementation will check if the buffer is uniquely referenced, and if so, mutate it directly. Otherwise, the array will perform a copy of the underlying buffer in order to preserve value semantics.
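The sharing and the copy-on-mutation described above can be observed from user code. This is only a minimal sketch, not the standard library's implementation: the helper `baseAddress(of:)` and the variable names are made up for illustration, and it assumes that comparing the element storage addresses reported by `withUnsafeBufferPointer` is a fair proxy for "same buffer".

```swift
let original = [1, 2, 3]
var copy = original            // no element copy yet: both arrays reference one buffer

// Hypothetical helper: address of an array's element storage.
func baseAddress(of array: [Int]) -> UnsafePointer<Int>? {
    return array.withUnsafeBufferPointer { $0.baseAddress }
}

let sharedBeforeWrite = baseAddress(of: original) == baseAddress(of: copy)

copy.append(4)                 // buffer is not uniquely referenced, so it is copied first

let sharedAfterWrite = baseAddress(of: original) == baseAddress(of: copy)

print(sharedBeforeWrite, sharedAfterWrite)
print(original)                // [1, 2, 3]
print(copy)                    // [1, 2, 3, 4]
```

Whatever the addresses turn out to be, the value semantics are guaranteed: the append to `copy` never changes `original`.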

However, with your Point structure – you're not implementing copy-on-write at a language level. Of course, as @Alexander says, this doesn't stop the compiler from performing all sorts of optimisations to minimise the cost of copying whole structures about. These optimisations needn't follow the exact behaviour of copy-on-write though – the compiler is simply free to do whatever it wishes, as long as the program runs according to the language specification.
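To make the contrast with Array concrete, here is a small sketch using the Point type from the question. It relies only on the standard library: `MemoryLayout` shows that the struct is nothing more than its stored Float, held inline, so there is no separate buffer whose copy could be deferred.

```swift
struct Point {
    var x: Float = 0
}

let p1 = Point()
var p2 = p1                       // semantically a full copy of the 4-byte value
p2.x += 1

print(p1.x, p2.x)                 // 0.0 1.0
print(MemoryLayout<Point>.size)   // 4
```

Copying four bytes is cheaper than a reference count update plus a uniqueness check, which is why a type like this gains nothing from copy-on-write.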

In your specific example, both p1 and p2 are global, therefore the compiler needs to make them distinct instances, as other .swift files in the same module have access to them (although this could potentially be optimised away with whole-module optimisation). However, the compiler still doesn't need to copy the instances – it can just evaluate the floating-point addition at compile-time and initialise one of the globals with 0.0, and the other with 1.0.

And if they were local variables in a function, for example:

struct Point {
    var x: Float = 0
}

func foo() {
    var p1 = Point()
    var p2 = p1
    p2.x += 1
    print(p2.x)
}

foo()

The compiler doesn't even have to create two Point instances to begin with – it can just create a single floating-point local variable initialised to 1.0, and print that.

Regarding passing value types as function arguments, for large enough types and (in the case of structures) functions that utilise enough of their properties, the compiler can pass them by reference rather than copying. The callee can then make a copy of them only if needed, such as when needing to work with a mutable copy.

In other cases where structures are passed by value, it's also possible for the compiler to specialise functions in order to only copy across the properties that the function needs.

For the following code:

struct Point {
    var x: Float = 0
    var y: Float = 1
}

func foo(p: Point) {
    print(p.x)
}

var p1 = Point()
foo(p: p1)

Assuming foo(p:) isn't inlined by the compiler (it will be in this example, but once its implementation reaches a certain size, the compiler won't think it worth it) – the compiler can specialise the function as:

func foo(px: Float) {
    print(px)
}

foo(px: 0)

It only passes the value of Point's x property into the function, thereby saving the cost of copying the y property.

So the compiler will do whatever it can in order to reduce the copying of value types. But with so many various optimisations in different circumstances, you cannot simply boil the optimised behaviour of arbitrary value types down to just copy-on-write.

Jailer answered 19/4, 2017 at 10:56 Comment(6)
So in Xcode with whole module optimization turned on, if I create a struct with var and then pass it around to a bunch of functions that do NOT mutate the struct, will Xcode optimize away all those copies? – Chickweed
@Chickweed It depends on the functions and the structure, but yes, it's fully possible – just found out (by going through the IR for an optimised build) that for large enough structures, Swift can pass them by reference to functions, therefore fully eliminating the copying (that is, until the callee mutates a copy). But with so many various optimisations and corner cases where they cannot be applied, you cannot simply boil the behaviour down to copy-on-write. Is there an actual performance bottleneck you're worried about, or are you just curious? – Jailer
Well, I wrote a game engine in Swift/Metal. I pass around a lot of structs that represent drawing commands to be consumed by the GPU and current frame data. At the time I thought all my structures would employ COW to avoid wasted copies, but then I learned that there was actually a lot of disagreement over what Xcode actually does. So I became worried my engine was not as optimized as I thought. My game runs at 60fps so right now it is not an issue; I'm just worried it won't scale well for future projects. – Chickweed
@Chickweed If it's not currently a performance bottleneck – I really wouldn't worry about it. As said, the compiler is able to perform lots of optimisations to reduce the amount of copying of value types. If it becomes a problem later down the line, you can relatively easily refactor your structure(s) to use copy-on-write; but you should only do so after identifying it as an issue when profiling, and after seeing that making the change actually boosts performance... – Jailer
...as implementing copy-on-write at a language level requires references, and therefore comes with the cost of both heap allocation and reference counting. Attempting to change your logic now without knowing for certain whether you're making things better or worse would be counterproductive. – Jailer
@Jailer CoW also requires adding a branch (to check if a copy is necessary) on every mutating method. Between branch prediction and speculative execution, I'm not sure how it would play out, but I'm reasonably certain that it would be slower than unconditionally copying all small structs. – Paludal

Swift Copy On Write (COW)

Make a copy only when it is necessary (e.g. when we change/write). By default, a value type does not support the COW mechanism, but some system structures, such as the collections (Array, Dictionary, Set), do.

Print address

import Foundation // needed for NSString

// Print memory address
func address(_ object: UnsafeRawPointer) -> String {
    let address = Int(bitPattern: object)
    return NSString(format: "%p", address) as String
}

Value type default behaviour

struct A {
    var value: Int = 0
}

//Default behavior(COW is not used)
var a1 = A()
var a2 = a1

//different addresses
print(address(&a1)) //0x7ffee48f24a8
print(address(&a2)) //0x7ffee48f24a0

//COW for a2 is not used
a2.value = 1
print(address(&a2)) //0x7ffee48f24a0

Value type with COW (Collection)

//collection(COW is realized)
var collection1 = [A()]
var collection2 = collection1

//same addresses
print(address(&collection1)) //0x600000c2c0e0
print(address(&collection2)) //0x600000c2c0e0

//COW for collection2 is used
collection2.append(A())
print(address(&collection2)) //0x600000c2c440

Use COW semantics for large values to avoid copying the data on every assignment. There are two common ways:

  1. use a wrapper around a value type which already supports COW.
  2. use a wrapper which holds a reference to heap storage where the large data is kept. The point is:
  • we can create multiple copies of the lightweight wrapper, all of which point to the same large data on the heap
  • when we try to modify (write) through a shared reference, a new reference to separate heap storage is created - COW in action. The global function isKnownUniquelyReferenced(_:) reports whether there is only a single strong reference to the object
struct Box<T> {
    fileprivate var ref: Ref<T>
    
    init(value: T) {
        self.ref = Ref(value: value)
    }

    var value: T {
        get {
            return ref.value
        }

        set {
            //it is true when there is only one(single) reference to this object
            //that is why it is safe to update,
            //if not - new reference to heap is created with a copy of value
            if isKnownUniquelyReferenced(&self.ref) {
                self.ref.value = newValue
            } else {
                self.ref = Ref(value: newValue)
            }
        }
    }
    
    //the parameter is named Wrapped so that it does not shadow Box's T
    final class Ref<Wrapped> {
        var value: Wrapped
        init(value: Wrapped) {
            self.value = value
        }
    }
}
let value = 0

var box1 = Box(value: value)
var box2 = box1

//same addresses
print(address(&box1.ref.value)) //0x600000ac2490
print(address(&box2.ref.value)) //0x600000ac2490

box2.value = 1

print(box1.value) //0
print(box2.value) //1

//COW in action
//different addresses
print(address(&box1.ref.value)) //0x600000ac2490
print(address(&box2.ref.value)) //0x600000a9dd30
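The first approach from the list above (wrapping a value type that already supports COW) needs no uniqueness check of your own. A minimal sketch, where `FrameData` and its members are hypothetical names chosen purely for illustration:

```swift
struct FrameData {
    // Array already implements copy-on-write, so the struct inherits it for this field.
    private var samples: [Float]

    init(samples: [Float]) {
        self.samples = samples
    }

    var count: Int { return samples.count }

    subscript(index: Int) -> Float {
        get { return samples[index] }
        set { samples[index] = newValue }   // Array copies its buffer here only if it is shared
    }
}

let frame1 = FrameData(samples: [0, 0, 0])
var frame2 = frame1      // copies only the array's buffer reference
frame2[0] = 1            // triggers the buffer copy; frame1 is unaffected

print(frame1[0])         // 0.0
print(frame2[0])         // 1.0
```

Copying a `FrameData` stays cheap however many elements it holds, and the first write to shared storage pays for one buffer copy.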
Dresser answered 7/12, 2021 at 18:39 Comment(2)
also, see: github.com/apple/swift/blob/main/docs/… – Biel
//not same addresses print(address(&box1)) //0x7ffee11b53d0 print(address(&box2)) //0x7ffee11b53c0 – Zugzwang
