Double-checked locking optimization to implement thread-safe lazy loading in Swift
I have implemented what I think is double-checked locking in a class to achieve thread-safe lazy loading.

Just in case you wondered, this is for a DI library I'm currently working on.

The code I'm talking about is the following:

final class Builder<I> {

   private let body: () -> I

   private var instance: I?
   private let instanceLocker = NSLock()

   private var isSet = false
   private let isSetDispatchQueue = DispatchQueue(label: "\(Builder.self)", attributes: .concurrent)

   init(body: @escaping () -> I) {
       self.body = body
   }

   private var syncIsSet: Bool {
       set {
          isSetDispatchQueue.async(flags: .barrier) {
             self.isSet = newValue
          }
       }
       get {
          var isSet = false
          isSetDispatchQueue.sync {
              isSet = self.isSet
          }
          return isSet
       }
   }

   var value: I {

       if syncIsSet {
           return instance! // should never fail
       }

       instanceLocker.lock()

       if syncIsSet {
           instanceLocker.unlock()
           return instance! // should never fail
       }

       let instance = body()
       self.instance = instance

       syncIsSet = true
       instanceLocker.unlock()

       return instance
    }
}

The logic is to allow concurrent reads of isSet so that accesses to instance can run in parallel from different threads. To avoid race conditions (that's the part I'm not 100% sure about), I have two barriers: one when setting isSet and one when setting instance. The trick is to unlock the latter only after isSet has been set to true, so the threads waiting for instanceLocker to be unlocked block a second time on isSet while it is being asynchronously written on the concurrent dispatch queue.

I think I'm very close to a final solution here, but since I'm not a distributed-systems expert, I'd like to make sure.

Also, using a dispatch queue wasn't my first choice, because it makes me suspect reading isSet isn't very efficient; but again, I'm no expert.

So my two questions are:

  • Is this 100% thread-safe and if not, why?
  • Is there a more efficient way of doing this in Swift?
Houseleek answered 10/6, 2018 at 0:45 Comment(4)
This is very over-complicated. A dispatch queue is absolutely the correct tool; NSLock is the wrong tool. I'm trying to work out what exactly the goal is, however. Is the goal here to make calls to body() be serialized? Is body() not promised to be reentrant? Or is there another goal? Forget about isSet; what is value supposed to do here? (There is almost never a correct use of NSLock in Swift, ever. NSLock stopped being the right tool when GCD was released, long before Swift was created.)Johnjohna
Double-check locking is also never the right tool in Swift. (It is very hard to get right in other languages, but there's just no reason to even try in Swift.) You're so close, too. Your code around isSet is exactly right (I just don't think you need isSet)Johnjohna
The goal here is to have a barrier when setting the value for the first time, then avoid this barrier once the value has been instantiated. I don't want the reads of instance to block the thread, knowing it's not necessary 99% of the time. If NSLock is wrong, I guess I could use a DispatchSemaphore instead. I can't dispatch body() on a queue because I want it to be called on the same thread as the caller.Houseleek
Why is double-checked locking never the right tool in Swift? I don't see any other options, since lazy var isn't thread-safe yet; there's a ticket for that (bugs.swift.org/browse/SR-1042) which is still open. And things like dispatch_once don't exist in Swift (I'm not sure I could have used it here anyway...)Houseleek
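On the dispatch_once point in the comment above: the C API did not survive into Swift, but the language itself guarantees that global and static stored properties are initialized lazily, atomically, and at most once (the compiler emits the equivalent of dispatch_once for you). A minimal sketch, with `Service` as a made-up name:

```swift
// `static let` gives once-only, thread-safe initialization for free:
// the first access from any thread runs the initializer exactly once,
// and every other access blocks until it is done.
final class Service {
    static let shared = Service()  // thread-safe, built at most once
    private init() {}
}

let a = Service.shared
let b = Service.shared
print(a === b) // true: both accesses see the same instance
```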

IMO, the correct tool here is os_unfair_lock. The point of double-checked locking is to avoid the expense of a full kernel lock, and os_unfair_lock provides that in the uncontested case. The "unfair" part is that it makes no promises to waiting threads: if one thread unlocks, it is allowed to relock without another waiting thread getting a chance (and could thus starve it). In practice, with a very small critical section, this is not relevant (here you're just checking an instance variable against nil). It is a lower-level primitive than dispatching to a queue: dispatch is very fast, but not as fast as an unfair lock, since GCD itself is built on primitives like os_unfair_lock.

import os.lock

final class Builder<I> {

    private let body: () -> I
    private var lock = os_unfair_lock()

    init(body: @escaping () -> I) {
        self.body = body
    }

    private var _value: I!
    var value: I {
        os_unfair_lock_lock(&lock)
        if _value == nil {
            _value = body()
        }
        os_unfair_lock_unlock(&lock)

        return _value
    }
}

Note that you were correct to synchronize access to syncIsSet. If you'd treated it as a primitive (as is common in double-checked locking in other languages), you'd be relying on things Swift doesn't promise: both the atomicity of writing Bools and that the boolean would actually be checked twice, since there's no volatile. Given that you're doing synchronization either way, the comparison is between os_unfair_lock and dispatching to a queue.

That said, in my experience this kind of laziness is almost always unwarranted in mobile apps. It only actually saves you time if the variable is very expensive to build but likely never accessed. Sometimes in massively parallel systems, being able to move the initialization around is worthwhile, but mobile apps live on a fairly limited number of cores, so there generally isn't some extra core lying around to shunt this off to. I generally would not pursue this unless you've already discovered that it is a significant problem when your framework is used in live systems. If you have, then I recommend profiling your approach against os_unfair_lock in real-world usages that show this problem. I expect os_unfair_lock to win.
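To illustrate the answer's point, here is a hypothetical test harness (not from the answer) that hammers value from many threads; body() should still run exactly once:

```swift
import Foundation
import os.lock

// The answer's Builder, reproduced so this snippet compiles on its own.
final class Builder<I> {
    private let body: () -> I
    private var lock = os_unfair_lock()
    private var _value: I!

    init(body: @escaping () -> I) {
        self.body = body
    }

    var value: I {
        os_unfair_lock_lock(&lock)
        if _value == nil { _value = body() }
        os_unfair_lock_unlock(&lock)
        return _value
    }
}

// Hammer `value` from many threads: `body` must run exactly once,
// because the nil check happens inside the critical section.
var callCount = 0
let builder = Builder<Int> { callCount += 1; return 42 }
DispatchQueue.concurrentPerform(iterations: 100) { _ in
    _ = builder.value
}
print(callCount) // 1
```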

Johnjohna answered 10/6, 2018 at 2:34 Comment(9)
Thanks! This is very helpful. I didn't know about os_unfair_lock.Houseleek
I also replaced the NSLock with a DispatchSemaphore in my original solution, which is working well. Do you know how good a DispatchSemaphore is compared to an NSLock? From the documentation, it seems to be optimized, but it's not described in what way.Houseleek
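For context, the swap the commenter describes (NSLock for a binary DispatchSemaphore) presumably looks roughly like this sketch; the class name is hypothetical. A semaphore created with value 1 lets one thread past wait() at a time, so wait()/signal() behave like lock()/unlock():

```swift
import Foundation

final class SemaphoreBuilder<I> {
    private let body: () -> I
    private let semaphore = DispatchSemaphore(value: 1)  // binary: acts as a mutex
    private var instance: I?

    init(body: @escaping () -> I) {
        self.body = body
    }

    var value: I {
        semaphore.wait()             // lock
        defer { semaphore.signal() } // unlock, even on early return
        if instance == nil { instance = body() }
        return instance!
    }
}

let builder = SemaphoreBuilder { 42 }
print(builder.value) // 42, built on first access
print(builder.value) // 42 again; body() is not called a second time
```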
In any case, your solution works well and seems to be the most optimized. I don't have a real-world usage that shows this problem so far, but since I'm building a library, I'm trying to deal the worst case scenario, since I don't know how it's gonna get used, I also don't know how heavy body() will be since it's provided by the library's user.Houseleek
Understood, but for a library this is probably exactly the wrong direction, because laziness has a non-trivial cost in the standard case (where the value is immediately used). If you don't need laziness, adding it makes performance worse. Better to leave that kind of laziness to the caller rather than bake it into the library, until use cases accumulate that indicate you need to add an optional lazy solution that callers can use.Johnjohna
DispatchSemaphore is probably pretty close to NSLock because they're both non-recursive fair locks (I would expect NSLock is built on DispatchSemaphore these days, but I haven't looked at the code recently). Both are generally slower than os_unfair_lock, which is a very low-level primitive (and so should only be used in cases where you are certain you need it; it's not a general-purpose solution).Johnjohna
Thanks again for all these info!Houseleek
You're probably right, making lazy loading the default behavior is the wrong direction if I want devs to have the opportunity to use my library in heavily multi-threaded programs. The underlying reason I'm using lazy loading right now is that I'm building a dependency container with an API allowing registration of dependency builders taking parameters, which means I can't build the dependency upfront since I don't have the parameters upfront. So I think what I'll do is provide an additional API to register instances directly, which will delegate the building part to the caller.Houseleek
I didn't look at the code of NSLock or DispatchSemaphore, but, according to my understanding of semaphores, they are faster than locks because they internally employ condition variables, which help the waiting threads get re-scheduled more quickly.Theisen
To ensure your os_unfair_lock and other C locks are allocated on the heap and never moved or copied, you should allocate and initialize (and later deinitialize and deallocate) them with an UnsafeMutablePointer. Otherwise, they may stop working unexpectedly, crash in rare cases, and leak kernel resources, according to Philippe Hausler of the GCD team: forums.swift.org/t/atomic-property-wrapper-for-standard-library/…Hamann
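A sketch of the advice in that comment, assuming an Apple platform where os.lock is available; the wrapper name is hypothetical:

```swift
import Dispatch
import os.lock

// Heap-allocate the C lock so it has a stable address and is never
// moved or copied by Swift, per the comment above.
final class UnfairLock {
    private let pointer: UnsafeMutablePointer<os_unfair_lock>

    init() {
        pointer = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
        pointer.initialize(to: os_unfair_lock())
    }

    deinit {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    func withLock<T>(_ body: () throws -> T) rethrows -> T {
        os_unfair_lock_lock(pointer)
        defer { os_unfair_lock_unlock(pointer) }
        return try body()
    }
}

let lock = UnfairLock()
var counter = 0
DispatchQueue.concurrentPerform(iterations: 100) { _ in
    lock.withLock { counter += 1 }
}
print(counter) // 100: every increment happened under the lock
```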
