Where should NSManagedObjectContext be created?

I've recently been learning about Core Data, specifically how to insert a large number of objects. After working out how to do this and solving a memory leak I ran into along the way, I wrote the Q&A Memory leak with large Core Data batch insert in Swift.

After changing NSManagedObjectContext from a class property to a local variable and also saving inserts in batches rather than one at a time, it worked a lot better. The memory problem cleared up and the speed improved.

The code I posted in my answer was

let batchSize = 1000

// do some sort of loop for each batch of data to insert
while (thereAreStillMoreObjectsToAdd) {

    // get the Managed Object Context
    let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
    managedObjectContext.undoManager = nil // if you don't need to undo anything

    // get the next 1000 or so data items that you want to insert
    let array = nextBatch(batchSize) // your own implementation

    // insert everything in this batch
    for item in array {

        // parse the array item or do whatever you need to get the entity attributes for the new object you are going to insert
        // ...

        // insert the new object
        let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
        newObject.attribute1 = item.whatever
        newObject.attribute2 = item.whoever
        newObject.attribute3 = item.whenever
    }

    // save the context
    do {
        try managedObjectContext.save()
    } catch {
        print(error)
    }
}

This method seems to be working well for me. The reason I am asking a question here, though, is that two people (who know a lot more about iOS than I do) made comments that I don't understand.

@Mundi said:

It seems in your code you are using the same managed object context, not a new one.

@MartinR also said:

... the "usual" implementation is a lazy property which creates the context once for the lifetime of the app. In that case you are reusing the same context as Mundi said.

Now I don't understand. Are they saying I am using the same managed object context or I should use the same managed object context? If I am using the same one, how is it that I create a new one on each while loop? Or if I should be using just one global context, how do I do it without causing memory leaks?

Previously, I had declared the context in my View Controller, initialized it in viewDidLoad, passed it as a parameter to my utility class doing the inserts, and just used it for everything. It was only after discovering the big memory leak that I started creating the context locally.

One of the other reasons I started creating the contexts locally is because the documentation said:

First, you should typically create a separate managed object context for the import, and set its undo manager to nil. (Contexts are not particularly expensive to create, so if you cache your persistent store coordinator you can use different contexts for different working sets or distinct operations.)

What is the standard way to use NSManagedObjectContext?

Tut answered 17/8, 2015 at 4:6 Comment(3)
You can put the let managedObjectContext and managedObjectContext.save() outside the loop; they don't need to be called on each iteration. Also, NSManagedObjectContext should be created once for all your database operations, which is why the AppDelegate is used to create it. - Rivet
What about saving in batches? If I have 100,000 objects to insert, should I still only call save once? - Tut
In that case, you might add logic to call save after every 5000 objects, for example, but don't call it every time, because that creates overhead. - Rivet

Now I don't understand. Are they saying I am using the same managed object context or I should use the same managed object context? If I am using the same one, how is it that I create a new one on each while loop? Or if I should be using just one global context, how do I do it without causing memory leaks?

Let's look at the first part of your code...

while (thereAreStillMoreObjectsToAdd) {
    let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
    managedObjectContext.undoManager = nil

Now, since it appears you are keeping your MOC in the App Delegate, it's likely that you are using the template-generated Core Data access code. Even if you are not, it is highly unlikely that your managedObjectContext access method is returning a new MOC each time it is called.

Your managedObjectContext variable is merely a reference to the MOC that is living in the App Delegate. Thus, each time through the loop, you are merely making a copy of the reference. The object being referenced is the exact same object each time through the loop.

Thus, I think they are saying that you are not using separate contexts, and I think they are right. Instead, you are using a new reference to the same context each time through the loop.
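
To see why the loop keeps getting the same context, here is a minimal sketch in plain, current Swift with no Core Data involved. FakeContext and FakeAppDelegate are hypothetical stand-ins for NSManagedObjectContext and the app delegate. The sketch assumes the delegate exposes its context as a lazy property, which is what the comments quoted in the question describe as the usual implementation.

```swift
// Hypothetical stand-in for NSManagedObjectContext; only object identity matters here.
final class FakeContext {}

// Hypothetical stand-in for the app delegate, assuming a lazy context property.
final class FakeAppDelegate {
    // The initial value is created once, on first access; later reads return the stored object.
    lazy var managedObjectContext: FakeContext = FakeContext()
}

let appDelegate = FakeAppDelegate()
var seen: [FakeContext] = []

for _ in 0..<3 {
    // A new local constant on each iteration, but it refers to the same object.
    let context = appDelegate.managedObjectContext
    seen.append(context)
}

// `===` compares object identity, not value.
let allSame = seen.allSatisfy { $0 === seen[0] }
print(allSame) // prints "true"
```

Declaring the constant inside the loop only creates a new name for the existing object. A genuinely new context would need an explicit call to an NSManagedObjectContext initializer.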


Now, your next set of questions has to do with performance. Your other post references some good content. Go back and look at it again.

What they are saying is that if you want to do a big import, you should create a separate context specifically for the import (shown in Objective-C, since I have not yet made time to learn Swift).

NSManagedObjectContext *moc = [[NSManagedObjectContext alloc]
    initWithConcurrencyType:NSPrivateQueueConcurrencyType];

You would then attach that MOC to the Persistent Store Coordinator. Using performBlock you would then, in a separate thread, import your objects.

The batching concept is correct. You should keep that. However, you should wrap each batch in an autorelease pool. I know you can do it in Swift... I'm just not sure if this is the exact syntax, but I think it's close...

autoreleasepool {
    for item in array {
        let newObject = NSEntityDescription.insertNewObjectForEntityForName ...
        newObject.attribute1 = item.whatever
        newObject.attribute2 = item.whoever
        newObject.attribute3 = item.whenever
    }
}

In pseudo-code, it would all look something like this...

moc = createNewMOCWithPrivateQueueConcurrencyAndAttachDirectlyToPSC()
moc.performBlock {
    while(true) {
        autoreleasepool {
            objects = getNextBatchOfObjects()
            if (!objects) { break }
            foreach (obj : objects) {
                insertObjectIntoMoc(obj, moc)
            }
        }
        moc.save()
        moc.reset()
    }
}

If someone wants to turn that pseudo-code into Swift, it's fine by me.

The autorelease pool ensures that any objects autoreleased as a result of creating your new objects are released at the end of each batch. Once the objects are released, the MOC should have the only reference to objects in the MOC, and once the save happens, the MOC should be empty.

The trick is to make sure that all objects created as part of the batch (including those representing the imported data and the managed objects themselves) are created inside the autorelease pool.

If you do other stuff, like fetching to check for duplicates, or have complex relationships, then it is possible that the MOC may not be entirely empty.

Thus, you may want to add the Swift equivalent of [moc reset] after the save to ensure that the MOC is indeed empty.
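
For anyone who wants to follow the control flow of the pseudo-code without a Core Data stack, here is a rough sketch in plain, current Swift. Everything in it is a hypothetical stand-in: FakeContext only counts objects, FakeSource hands out integers in batches, and pool takes the place of autoreleasepool so the sketch runs anywhere. One detail it does show accurately is that a Swift closure cannot break out of the enclosing loop, so the end of the data is signalled with a flag.

```swift
// Hypothetical stand-in for a managed object context; it only counts objects.
final class FakeContext {
    private(set) var registeredCount = 0 // objects the context currently holds
    private(set) var savedCount = 0      // objects written so far
    private var unsaved = 0

    func insert(_ value: Int) { registeredCount += 1; unsaved += 1 }
    func save() { savedCount += unsaved; unsaved = 0 }
    func reset() { registeredCount = 0; unsaved = 0 }
}

// Hypothetical data source: returns up to `batchSize` items, then nil when empty.
final class FakeSource {
    private var remaining: [Int]
    init(count: Int) { remaining = Array(0..<count) }

    func nextBatch(_ batchSize: Int) -> [Int]? {
        guard !remaining.isEmpty else { return nil }
        let batch = Array(remaining.prefix(batchSize))
        remaining.removeFirst(batch.count)
        return batch
    }
}

// Stand-in for autoreleasepool; it just runs the closure.
func pool(_ body: () -> Void) { body() }

let source = FakeSource(count: 2500)
let moc = FakeContext()
var batchesInserted = 0
var moreBatches = true

while moreBatches {
    pool {
        // A closure cannot `break` out of the enclosing loop, so set a flag instead.
        guard let objects = source.nextBatch(1000) else {
            moreBatches = false
            return
        }
        for obj in objects { moc.insert(obj) }
        batchesInserted += 1
    }
    moc.save()  // one save per batch
    moc.reset() // drop whatever the context still holds
}

print(moc.savedCount, batchesInserted, moc.registeredCount) // prints "2500 3 0"
```

In real code the body of the loop would run inside performBlock on a private-queue context, and save() can throw, so it needs error handling.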

Semi answered 17/8, 2015 at 23:41 Comment(1)
In my own code I believe I have now been able to implement all of your suggestions successfully. I also added another answer to this question that expresses your pseudocode in Swift syntax. If you see any problems with it, let me know. - Tut

This is a supplemental answer to @JodyHagins' answer. I am providing a Swift implementation of the pseudocode that was provided there.

let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is

managedObjectContext.performBlock { // runs asynchronously

    var moreBatches = true
    while moreBatches { // loop through each batch of inserts

        autoreleasepool {
            // `break` cannot leave the loop from inside a closure, so clear the flag and return instead
            guard let array = getNextBatchOfObjects() else {
                moreBatches = false
                return
            }
            for item in array {
                let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
                newObject.attribute1 = item.whatever
                newObject.attribute2 = item.whoever
                newObject.attribute3 = item.whenever
            }
        }

        // only save once per batch insert
        do {
            try managedObjectContext.save()
        } catch {
            print(error)
        }

        managedObjectContext.reset()
    }
}


Tut answered 20/8, 2015 at 3:41 Comment(1)
Even though you are saving in batches, you can still get lock contention with other readers, because you share the same PSC. However, if you were to create a separate PSC for the batch upload, you can let SQLite do the locking. On iOS 7 and above, you automatically get WAL, which means you can have N readers and 1 writer accessing the SQLite database at the same time. Thus, your readers are not locked out while your writer is saving to disk. - Semi
