Trying to Understand Asynchronous Operation Subclass
Asked Answered
P

3

33

I am trying to get started with using Operations in a side project rather than having closure-based callbacks littered throughout my networking code to help eliminate nested calls. So I was doing some reading on the subject, and I came across this implementation:

open class AsynchronousOperation: Operation {

    // MARK: - Properties

    private let stateQueue = DispatchQueue(label: "asynchronous.operation.state", attributes: .concurrent)

    private var rawState = OperationState.ready

    private dynamic var state: OperationState {
        get {
            return stateQueue.sync(execute: {
                rawState
            })
        }
        set {
            willChangeValue(forKey: "state")
            stateQueue.sync(flags: .barrier, execute: {
                rawState = newValue
            })
            didChangeValue(forKey: "state")
        }
    }

    public final override var isReady: Bool {
        return state == .ready && super.isReady
    }

    public final override var isExecuting: Bool {
        return state == .executing
    }

    public final override var isFinished: Bool {
        return state == .finished
    }

    public final override var isAsynchronous: Bool {
        return true
    }


    // MARK: - NSObject

    private dynamic class func keyPathsForValuesAffectingIsReady() -> Set<String> {
        return ["state"]
    }

    private dynamic class func keyPathsForValuesAffectingIsExecuting() -> Set<String> {
        return ["state"]
    }

    private dynamic class func keyPathsForValuesAffectingIsFinished() -> Set<String> {
        return ["state"]
    }


    // MARK: - Foundation.Operation

    public final override func start() {
        super.start()

        if isCancelled {
            finish()
            return
        }

        state = .executing
        execute()
    }


    // MARK: - Public

    /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function throws an exception.
    open func execute() {
        fatalError("Subclasses must implement `execute`.")
    }

    /// Call this function after any work is done or after a call to `cancel()` to move the operation into a completed state.
    public final func finish() {
        state = .finished
    }
}

@objc private enum OperationState: Int {

    case ready

    case executing

    case finished
}

There are some implementation details of this Operation subclass that I would like some help in understanding.

  1. What is the purpose of the stateQueue property? I see it being used by get and set of the state computed property, but I can't find any documentation that explains the sync:flags:execute and sync:execute methods that they use.

  2. What is the purpose of the three class methods in the NSObject section that return ["state"]? I don't see them being used anywhere. I found, in NSObject, class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String>, but that doesn't seem to help me understand why these methods are declared.

Plaque answered 22/4, 2017 at 15:42 Comment(5)
FYI, this has a fairly serious mistake. The start documentation says (emphasis added): "If you are implementing a concurrent operation, you must override this method and use it to initiate your operation. Your custom implementation must not call super at any time."Stroll
@Stroll The keyword in the quoted paragraph is "concurrent". This is relevant only for operations that do not run in an operation queue. See _isAsynchronous property and this paragraph: developer.apple.com/documentation/foundation/operation#1661231Jerrilyn
@Stroll None taken, but I invite you to take a closer look. I've been using operations for years. Even Apple's sample code in the famous Advanced Operations WWDC2015 talk overrides and calls super in start(). I do not say that sample code is perfect, but. I have the impression the paragraph that says you should not call super is what could be easily misunderstood.Jerrilyn
Furthermore, what Apple calls asynchronous, is an operation that spawns a new thread from start(). This is not the same as running operations in a background queue.Jerrilyn
Lol. I know what "asynchronous operation" means. Bottom line, the documentation couldn't be clearer about start implementations and they repeat the warning several times: "At no time in your start() method should you ever call super" and "Your custom implementation [of start] must not call super at any time." And I wouldn't hold out "Advanced NSOperations" as the exemplar because, as you allude, its problems are infamous and manifold. But, hey, if you want to call super, knock yourself out. But future readers should be forewarned, as it can cause problems if you're not careful.Stroll
S
64

You said:

  1. What is the purpose of the stateQueue property? I see it being used by get and set of the state computed property, but I can't find any documentation that explains the sync:flags:execute and sync:execute methods that they use.

This code "synchronizes" access to a property to make it thread safe. Regarding why you need to do that, see the Operation documentation, which advises:

Multicore Considerations

... When you subclass NSOperation, you must make sure that any overridden methods remain safe to call from multiple threads. If you implement custom methods in your subclass, such as custom data accessors, you must also make sure those methods are thread-safe. Thus, access to any data variables in the operation must be synchronized to prevent potential data corruption. For more information about synchronization, see Threading Programming Guide.

Regarding the exact use of this concurrent queue for synchronization, this is known as the "reader-writer" pattern. This basic concept of reader-writer pattern is that reads can happen concurrent with respect to each other (hence sync, with no barrier), but writes must never be performed concurrently with respect to any other access of that property (hence async with barrier).

For example, you might implement a reader-writer for thread-safety on an array like so:

class ThreadSafeArray<T> {
    private var values: [T]
    private let queue = DispatchQueue(label: "...", attributes: .concurrent)
    
    init(_ values: [T]) {
        self.values = values
    }
    
    func reader<U>(block: () throws -> U) rethrows -> U {
        return try queue.sync {
            try block()
        }
    }
    
    func writer(block: @escaping (inout [T]) -> Void) {
        queue.async(flags: .barrier) {
            block(&self.values)
        }
    }
    
    // e.g. you might use `reader` and `writer` like the following:
    
    subscript(_ index: Int) -> T {
        get { reader { values[index] } }
        set { writer { $0[index] = newValue } }
    }
    
    func append(_ value: T) {
        writer { $0.append(value) }
    }
    
    func remove(at index: Int) {
        writer { $0.remove(at: index)}
    }
}

Obviously, the use of reader-writer in this Operation subclass is even simpler, but the above illustrates the pattern.

You also asked:

  1. What is the purpose of the three class methods in the NSObject section that return ["state"]? I don't see them being used anywhere. I found, in NSObject, class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String>, but that doesn't seem to help me understand why these methods are declared.

These are just methods that ensure that changes to the state property trigger KVO notifications for properties isReady, isExecuting and isFinished. The KVO notifications of these three keys is critical for the correct functioning of asynchronous operations. Anyway, this syntax is outlined in the Key-Value Observing Programming Guide: Registering Dependent Keys.

The keyPathsForValuesAffectingValue method you found is related. You can either register dependent keys using that method, or have the individual methods as shown in your original code snippet.


BTW, here is a revised version of the AsynchronousOperation class you provided, namely:

  1. You must not call super.start(). As the start documentation says (emphasis added):

    If you are implementing a concurrent operation, you must override this method and use it to initiate your operation. Your custom implementation must not call super at any time.

  2. Add @objc required in Swift 4.

  3. Renamed execute to use main, which is the convention for Operation subclasses.

  4. It is inappropriate to declare isReady as a final property. Any subclass should have the right to further refine its isReady logic (though we admittedly rarely do so).

  5. Use #keyPath to make code a little more safe/robust.

  6. You don't need to do manual KVO notifications when using dynamic property. The manual calling of willChangeValue and didChangeValue is not needed in this example.

  7. Change finish so that it only moves to .finished state if not already finished.

Thus:

public class AsynchronousOperation: Operation {
    
    /// State for this operation.
    
    @objc private enum OperationState: Int {
        case ready
        case executing
        case finished
    }
    
    /// Concurrent queue for synchronizing access to `state`.
    
    private let stateQueue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".rw.state", attributes: .concurrent)
    
    /// Private backing stored property for `state`.
    
    private var _state: OperationState = .ready
    
    /// The state of the operation
    
    @objc private dynamic var state: OperationState {
        get { return stateQueue.sync { _state } }
        set { stateQueue.async(flags: .barrier) { self._state = newValue } }
    }
    
    // MARK: - Various `Operation` properties
    
    open         override var isReady:        Bool { return state == .ready && super.isReady }
    public final override var isExecuting:    Bool { return state == .executing }
    public final override var isFinished:     Bool { return state == .finished }
    public final override var isAsynchronous: Bool { return true }

    // KVN for dependent properties
    
    open override class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String> {
        if ["isReady", "isFinished", "isExecuting"].contains(key) {
            return [#keyPath(state)]
        }
        
        return super.keyPathsForValuesAffectingValue(forKey: key)
    }
    
    // Start
    
    public final override func start() {
        if isCancelled {
            state = .finished
            return
        }
        
        state = .executing
        
        main()
    }
    
    /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function throws an exception.
    
    open override func main() {
        fatalError("Subclasses must implement `main`.")
    }
    
    /// Call this function to finish an operation that is currently executing
    
    public final func finish() {
        if !isFinished { state = .finished }
    }
}
Stroll answered 4/1, 2018 at 21:50 Comment(18)
Nice answer, except based on the documentation, for asynchronous operations one should override start instead of main (main is for synchronous operations). IMHO It would help to avoid confusion if you can remove the main overrideHarleigh
I disagree. The Concurrency Programming Guide encourages use of both start and main for asynchronous operations. In their discussion about main in conjunction with asynchronous operations they say, "Although you could perform the task in the start method, implementing the task using [main] can result in a cleaner separation of your setup and task code."Stroll
You are correct, thanks I was doing it in start for asynchronous operations but this is cleaner.Harleigh
@Stroll Thanks for sharing the revised version of the asynchronous operation. But as the programming guide states also that main is optional for concurrent operations in practice it makes no difference to put the task code into main or (custom) execute. The general workflow is the same.Chenee
@Chenee - Agreed. As I said in my answer, I renamed execute to main merely because it was a matter of convention; I didn't do it because I thought one had to. It just struck me as really strange to introduce a completely different method name when it's so common (and explicitly contemplated by Apple) to use main for this purpose. My comment above was just in reaction to user1046037's assumption that main was only for synchronous operations. I wonder if a similar misconception drove calebd's decision to use execute in the first place...Stroll
@Stroll I'm using that class in a few projects and personally I like the concept of execute and finish as a differentiation to (implicitly called) synchronous main and to reflect isExecuting / isFinishing and the corresponding states.Chenee
@Chenee - Fine. I think that's a little weird given that Apple explicitly discusses using main for this purpose in the Concurrency Programming Guide: Configuring Operations for Concurrent Execution and in the documentation for main, but to each his own. But it doesn't really matter.Stroll
@Stroll Could you have a look at another Operation problem I am facing in #48138396Harleigh
@Rob: What does "rw" stands for in the label name of the queue? Read/Write?Girlie
@VinGazoil - It stands for “reader-writer”, a synchronization pattern in which you use a concurrent queue, allowing concurrent reads, but using barrier for writes to ensure writes are properly synchronized.Stroll
@Stroll - Thank you for this precision.Girlie
Amazing answer! Thank you.Rodrigues
@Stroll Thanks for answering my other question. I'm still a bit confused about the start() method here. I subclassed the AsynchronousOperation class and created an operation for uploading files to S3. Do I manually have to call the start() method to begin the operation? Or does the OperationQueue take care of that?Denticle
@Denticle - Perhaps you can post to somewhere like gist.github.com, where I can post comments/corrections? In answer to your question, the operation queue will call start for you. But your main implementation isn’t quite right...Stroll
@Stroll Do you think it's possible to perform the thread safety with the new concurrency API of Swift 5.5?Chenee
@Chenee We'd just make it an actor and you're done. It's an extremely elegant solution and very simple. I'm already adopting actors and async-await in my apps and love the elegance of the code. (Earlier I was worried about how slow it was, but that looks like a broader performance difference between concurrentPerform and async-await task groups, not the actor.)Stroll
@Stroll Thanks, with using actors you mean to replace the entire Operation subclass with an actor and OperationQueue with a TaskGroup or an AsyncSequence pattern.Chenee
As it relates to this question, where doing async-await (which I’d prefer to operations in most use-cases), ensuring thread-safety with actors is definitely the way to go. In those rare cases where I might still use Operation, I still stick with traditional synchronization mechanisms. (I personally do not use reader-writer, though. Other approaches are more performant and just as elegant.)Stroll
K
8

When using an updated code snippet from Rob's answer, one should be aware of possibility of a bug, caused by this change:

  1. Change finish so that it only moves to .finished state if isExecuting.

The above goes against Apple docs:

In addition to simply exiting when an operation is cancelled, it is also important that you move a cancelled operation to the appropriate final state. Specifically, if you manage the values for the finished and executing properties yourself (perhaps because you are implementing a concurrent operation), you must update those properties accordingly. Specifically, you must change the value returned by finished to YES and the value returned by executing to NO. You must make these changes even if the operation was cancelled before it started executing.

This will cause a bug in a few cases. For example, if Operation Queue with "maxConcurrentOperationCount = 1" gets 3 async operations A B and C, then if all operations are cancelled during A, C will not get executed and the queue will be stuck on operation B.

Karachi answered 1/5, 2019 at 18:22 Comment(2)
so what's the solution here?Hopehopeful
Good catch. I’ve fixed this in my answer. Thanks.Stroll
L
3

About your first question: stateQueue lock your operation when writing a new value to you operation state by:

    return stateQueue.sync(execute: {
            rawState
    })

And

    stateQueue.sync(flags: .barrier, execute: {
        rawState = newValue
    })

as your operation is asynchronous so before read or write one state another state can be called. Like you want to write isExecution but in the mean time isFinished already called. So to avoid this scenario stateQueue lock the operation state to be read and write until it finished its previous call. Its work like Atomic. Rather use dispatch queue you can use an extension to NSLock to simplify executing critical code from Advanced NSOperations sample code in WWDC 2015 https://developer.apple.com/videos/play/wwdc2015/226/ from https://developer.apple.com/sample-code/wwdc/2015/downloads/Advanced-NSOperations.zip and you can implement like following:

private let stateLock = NSLock()

private dynamic var state: OperationState {
    get {
        return stateLock.withCriticalScope{ rawState } 
    }
    set {
        willChangeValue(forKey: "state")

        stateLock.withCriticalScope { 
            rawState = newValue
        }
        didChangeValue(forKey: "state")
    }
}

About your second question: Its a KVO notification for the read only property isReady, isExecuting, isFinished to manage the operation state. You can read this: http://nshipster.com/key-value-observing post till the end for better understanding about KVO.

Lopeared answered 3/5, 2017 at 13:29 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.