UnsafeMutablePointer.pointee and didSet properties

I got some unexpected behavior using UnsafeMutablePointer on an observed property in a struct I created (on Xcode 10.1, Swift 4.2). See the following playground code:

struct NormalThing {
    var anInt = 0
}

struct IntObservingThing {
    var anInt: Int = 0 {
        didSet {
            print("I was just set to \(anInt)")
        }
    }
}

var normalThing = NormalThing(anInt: 0)
var ptr = UnsafeMutablePointer(&normalThing.anInt)
ptr.pointee = 20
print(normalThing.anInt) // "20\n"

var intObservingThing = IntObservingThing(anInt: 0)
var otherPtr = UnsafeMutablePointer(&intObservingThing.anInt)
// "I was just set to 0."

otherPtr.pointee = 20
print(intObservingThing.anInt) // "0\n"

Seemingly, modifying the pointee of an UnsafeMutablePointer to an observed property doesn't actually modify the value of the property. Also, the act of creating the pointer from the property fires the didSet observer. What am I missing here?

Gradygrae answered 22/12, 2018 at 0:37 Comment(0)

Any time you see a construct like UnsafeMutablePointer(&intObservingThing.anInt), you should be extremely wary about whether it'll exhibit undefined behaviour. In the vast majority of cases, it will.

First, let's break down exactly what's happening here. UnsafeMutablePointer doesn't have any initialisers that take inout parameters, so what initialiser is this calling? Well, the compiler has a special conversion that allows a & prefixed argument to be converted to a mutable pointer to the 'storage' referred to by the expression. This is called an inout-to-pointer conversion.

For example:

func foo(_ ptr: UnsafeMutablePointer<Int>) {
  ptr.pointee += 1
}

var i = 0
foo(&i)
print(i) // 1

The compiler inserts a conversion that turns &i into a mutable pointer to i's storage. Okay, but what happens when i doesn't have any storage? For example, what if it's computed?

func foo(_ ptr: UnsafeMutablePointer<Int>) {
  ptr.pointee += 1
}

var i: Int {
  get { return 0 }
  set { print("newValue = \(newValue)") }
}
foo(&i)
// prints: newValue = 1

This still works, so what storage is being pointed to by the pointer? To solve this problem, the compiler:

  1. Calls i's getter, and places the resultant value into a temporary variable.
  2. Gets a pointer to that temporary variable, and passes that to the call to foo.
  3. Calls i's setter with the new value from the temporary.

Effectively doing the following:

var j = i // calling `i`'s getter
foo(&j)
i = j     // calling `i`'s setter

It should hopefully be clear from this example that this imposes an important constraint on the lifetime of the pointer passed to foo – it can only be used to mutate the value of i during the call to foo. Attempting to escape the pointer and using it after the call to foo will result in a modification of only the temporary variable's value, and not i.

For example:

func foo(_ ptr: UnsafeMutablePointer<Int>) -> UnsafeMutablePointer<Int> {
  return ptr
}

var i: Int {
  get { return 0 }
  set { print("newValue = \(newValue)") }
}
let ptr = foo(&i)
// prints: newValue = 0
ptr.pointee += 1

ptr.pointee += 1 takes place after i's setter has already been called with the temporary variable's value, so it has no effect on i.

Worse than that, it exhibits undefined behaviour, as the compiler doesn't guarantee that the temporary variable will remain valid after the call to foo has ended. For example, the optimiser could de-initialise it immediately after the call.

Okay, but as long as we only get pointers to variables that aren't computed, we should be able to use the pointer outside of the call it was passed to, right? Unfortunately not. It turns out there are lots of other ways to shoot yourself in the foot when escaping inout-to-pointer conversions!

To name just a few (there are many more!):

  • A local variable is problematic for a similar reason to our temporary variable from earlier – the compiler doesn't guarantee that it will remain initialised until the end of the scope it's declared in. The optimiser is free to de-initialise it earlier.

    For example:

    func bar() {
      var i = 0
      let ptr = foo(&i)
      // Optimiser could de-initialise `i` here.
    
      // ... making this undefined behaviour!
      ptr.pointee += 1
    }
    
  • A stored variable with observers is problematic because under the hood it's actually implemented as a computed variable that calls its observers in its setter.

    For example:

    var i: Int = 0 {
      willSet(newValue) {
        print("willSet to \(newValue), oldValue was \(i)")
      }
      didSet(oldValue) {
        print("didSet to \(i), oldValue was \(oldValue)")
      }
    }
    

    is essentially syntactic sugar for:

    var _i: Int = 0
    
    func willSetI(newValue: Int) {
      print("willSet to \(newValue), oldValue was \(i)")
    }
    
    func didSetI(oldValue: Int) {
      print("didSet to \(i), oldValue was \(oldValue)")
    }
    
    var i: Int {
      get {
        return _i
      }
      set {
        willSetI(newValue: newValue)
        let oldValue = _i
        _i = newValue
        didSetI(oldValue: oldValue)
      }
    }
    
  • A non-final stored property on classes is problematic as it can be overridden by a computed property.
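A minimal sketch of that last case, using hypothetical Base and Derived classes and a hypothetical increment function (none of these names come from the question). The pointer is only used during the call here, so this particular snippet should be well-defined; it is meant to show that the pointer refers to a temporary rather than to the stored property:

```swift
// Hypothetical classes, for illustration only.
class Base {
    var value = 0 // non-final stored property
}

final class Derived: Base {
    var setterCalls: [Int] = [] // records every value passed to the setter

    // Overrides the stored property with a computed one.
    override var value: Int {
        get { return 42 }
        set { setterCalls.append(newValue) }
    }
}

func increment(_ ptr: UnsafeMutablePointer<Int>) {
    ptr.pointee += 1
}

let derived = Derived()
let object: Base = derived

// Statically this looks like a pointer to Base's stored property, but the
// access is dispatched dynamically: the getter result goes into a temporary,
// the pointer refers to that temporary, and the setter runs afterwards.
increment(&object.value)

print(derived.setterCalls) // [43]
print(object.value)        // 42
```

Escaping the pointer from increment would therefore leave you holding a pointer to a temporary, exactly as in the computed-variable example earlier.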

And this isn't even considering cases that rely on implementation details within the compiler.

For this reason, the compiler only guarantees stable and unique pointer values from inout-to-pointer conversions on global and static stored variables without observers. In any other case, attempting to escape and use a pointer from an inout-to-pointer conversion after the call it was passed to will lead to undefined behaviour.
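As a rough sketch of the one case that is described as guaranteed, here is a static stored variable without observers. The Counters type and its hits property are hypothetical names used only for illustration, and the snippet assumes the stable-pointer guarantee stated above holds for your compiler version:

```swift
// Hypothetical namespace, for illustration only.
enum Counters {
    static var hits = 0 // static stored variable, no observers
}

// Escaping the pointer is expected to be fine here, because the conversion
// should yield a pointer to the variable's own stable storage.
let hitsPtr = UnsafeMutablePointer(&Counters.hits)
hitsPtr.pointee += 1

// A second conversion should produce the same pointer value.
let samePointer = UnsafeMutablePointer(&Counters.hits) == hitsPtr

print(Counters.hits) // 1
print(samePointer)   // true
```

Adding a didSet observer to hits, or turning it into an instance property of a local struct, would take the code outside that guarantee.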


Okay, but how does my example with the function foo relate to your example of calling an UnsafeMutablePointer initialiser? Well, UnsafeMutablePointer has an initialiser that takes an UnsafeMutablePointer argument (as a result of conforming to the underscored _Pointer protocol which most standard library pointer types conform to).

This initialiser is effectively the same as the foo function: it takes an UnsafeMutablePointer argument and returns it. Therefore when you do UnsafeMutablePointer(&intObservingThing.anInt), you're escaping the pointer produced from the inout-to-pointer conversion, which, as we've discussed, is only valid if it's used on a stored global or static variable without observers.

So, to wrap things up:

var intObservingThing = IntObservingThing(anInt: 0)
var otherPtr = UnsafeMutablePointer(&intObservingThing.anInt)
// "I was just set to 0."

otherPtr.pointee = 20

is undefined behaviour. The pointer produced from the inout-to-pointer conversion is only valid for the duration of the call to UnsafeMutablePointer's initialiser. Attempting to use it afterwards results in undefined behaviour. As the other answer demonstrates, if you want scoped pointer access to intObservingThing.anInt, you want to use withUnsafeMutablePointer(to:).
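To make the scoped approach concrete, here is a small sketch with a hypothetical ObservedInt type that records its didSet calls in a log array instead of printing them (the type and its members are assumptions made for illustration, not part of the question's code):

```swift
// Hypothetical type, for illustration only.
struct ObservedInt {
    var log: [Int] = [] // values seen by didSet, in order
    var value: Int = 0 {
        didSet { log.append(value) }
    }
}

var thing = ObservedInt()

// The pointer is only valid inside the closure. The observer runs once,
// after the closure returns, when the access to `thing.value` ends.
withUnsafeMutablePointer(to: &thing.value) { ptr in
    ptr.pointee = 20
}

print(thing.value) // 20
print(thing.log)   // [20]
```

The property ends up holding 20 and the observer fires exactly once with the final value, which is the behaviour the question expected from the escaped pointer.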

I'm actually currently working on implementing a warning (which will hopefully transition to an error) that will be emitted on such unsound inout-to-pointer conversions. Unfortunately I haven't had much time lately to work on it, but all things going well, I'm aiming to start pushing it forwards in the new year, and hopefully get it into a Swift 5.x release.

In addition, it's worth noting that while the compiler doesn't currently guarantee well-defined behaviour for:

var normalThing = NormalThing(anInt: 0)
var ptr = UnsafeMutablePointer(&normalThing.anInt)
ptr.pointee = 20

from the discussion on #20467, it looks likely that the compiler will guarantee well-defined behaviour for this in a future release, because the base (normalThing) is a fragile stored global variable of a struct type without observers, and anInt is a fragile stored property without observers.

Lightman answered 22/12, 2018 at 15:0 Comment(1)
Right, I sensed it had something to do with pointer lifetimes and undefined behavior but I couldn’t formulate the actual notion. I also knew you’d been delving into pointers which is why I flagged this on bugs.swift.org! - Cristobal

I'm pretty sure the problem is that what you're doing is illegal. You can't just declare an unsafe pointer and claim that it points at the address of a struct property. (In fact, I don't even understand why your code compiles in the first place; what initializer does the compiler think this is?) The correct way, which gives the expected results, is to ask for a pointer that does point at that address, like this:

struct IntObservingThing {
    var anInt: Int = 0 {
        didSet {
            print("I was just set to \(anInt)")
        }
    }
}
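
var intObservingThing = IntObservingThing(anInt: 0)
withUnsafeMutablePointer(to: &intObservingThing.anInt) { ptr -> Void in
    ptr.pointee = 20
}
// didSet fires once the closure returns, printing: I was just set to 20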
print(intObservingThing.anInt) // 20
Cristobal answered 22/12, 2018 at 1:26 Comment(6)
"You can't just declare an unsafe pointer and claim that it points at the address of a struct property." - can you point to any documentation that states this? Not challenging it, but I'd like to see the reasoning behind this statement. - Gradygrae
In regards to what initializer the compiler thinks this is - why would it think it is anything other than an initializer to UnsafeMutablePointer<Int>? - Gradygrae
Let's look at it this way. You write UnsafeMutablePointer(&intObservingThing.anInt). Now, if that is legal, it should be interchangeable with UnsafeMutablePointer.init(&intObservingThing.anInt). But if you write that, the compiler chokes, and rightly so. So I think your expression, by some bug in the compiler, is sneaking past the compiler's guard. I admit I'm being a bit hazy but I do think I've explained how to do this correctly and in such a way as to get the expected result. - Cristobal
Definitely works this way. Does this mean that naked UnsafeMutablePointers aren't meant to be declared/initialized explicitly? i.e. they will always be a parameter in some function? - Gradygrae
No, I didn’t say that. - Cristobal
@Cristobal The fact that UnsafeMutablePointer.init(&intObservingThing.anInt) behaves differently to UnsafeMutablePointer(&intObservingThing.anInt) is a bug where using .init prevents the compiler from preferring a non-optional param over an optional param in certain cases during overload ranking, leading to ambiguities. They should really both compile (well, at least until they hopefully get rejected in a future version for being totally unsound). - Lightman
