Xcode 7.0, iOS 9 SDK: deadlock while executing a fetch request with performBlockAndWait

Updated: I have prepared a sample that reproduces the issue without MagicalRecord. Please download the test project from the following URL: https://www.dsr-company.com/fm.php?Download=1&FileToDL=DeadLockTest_CoreDataWithoutMR.zip

The provided project has the following problem: a deadlock on a fetch in performBlockAndWait called from the main thread.

The issue is reproduced if the code is compiled with an Xcode version later than 6.4. It is not reproduced if the code is compiled with Xcode 6.4.

Old question was:

I am working on supporting an iOS mobile application. After the recent update of Xcode from version 6.4 to version 7.0 (with iOS 9 support), I ran into a critical issue: the application hangs. The same build of the application (produced from the same sources) with Xcode 6.4 works OK. So, if the application is built with an Xcode version later than 6.4, it hangs in some cases; if it is built with Xcode 6.4, it works OK.

I spent some time researching the issue, and as a result I prepared a test application that reproduces the problem with a case similar to the one in my application. The test application hangs when built with Xcode >= 7.0 but works correctly when built with Xcode 6.4.

Download link of test sources: https://www.sendspace.com/file/r07cln

The requirements for the test application are:

  1. The CocoaPods manager must be installed on the system.
  2. The MagicalRecord framework, version 2.2.

The test application works as follows:

  1. At startup it creates a test database with 10000 records of simple entities and saves them to the persistent store.
  2. On the first screen of the application, in the method viewWillAppear:, it runs the test that causes the deadlock.

The following algorithm is used:

-(NSArray *) entityWithId: (int) entityId inContext:(NSManagedObjectContext *)localContext 
{
   NSArray * results = [TestEntity MR_findByAttribute:@"id" withValue:[ NSNumber numberWithInt: entityId ] inContext:localContext];
  return results;
}

…..
int entityId = 88;
NSManagedObjectContext *childContext1 = [NSManagedObjectContext MR_context];
childContext1.name = @"childContext1";

NSManagedObjectContext *childContext2 = [NSManagedObjectContext MR_context];
childContext2.name = @"childContext2";

NSArray *results = [self entityWithId:entityId inContext: childContext2];

for(TestEntity *d in results)
{
    NSLog(@"e from fetchRequest %@ with name = '%@'", d,  d.name); /// this line is the reason of the hangup
}

dispatch_async(dispatch_get_main_queue(), ^
               {
                   int entityId2 = 11;
                   NSPredicate *predicate2 = [NSPredicate predicateWithFormat:@"id=%d", entityId2];
                   NSArray *a = [ TestEntity MR_findAllWithPredicate: predicate2 inContext: childContext2];
                   for(TestEntity *d in a)
                   {
                       NSLog(@"e from fetchRequest %@ with name = '%@'", d,  d.name);
                   }
               });

Two managed object contexts are created with concurrency type NSPrivateQueueConcurrencyType (please check the code of MR_context in the MagicalRecord framework). Both contexts have a parent context with concurrency type NSMainQueueConcurrencyType. From the main thread, the application performs a fetch synchronously (MR_findByAttribute and MR_findAllWithPredicate use performBlockAndWait with a fetch request inside). After the first fetch, the second fetch is scheduled on the main thread using dispatch_async().

As a result, the application hangs. It seems that a deadlock has occurred; please check the screenshot of the stack:

(Here is the link; my reputation is too low to post images: https://cdn.img42.com/34a8869bd8a5587222f9903e50b762f9.png)

If I comment out the line
NSLog(@"e from fetchRequest %@ with name = '%@'", d, d.name); /// this line is the reason of the hangup

(which is line 39 in ViewController.m of the test project), the application works OK. I believe this is because the name field of the test entity is then never read.

So, with the NSLog line commented out, there is no hang in binaries built with either Xcode 6.4 or Xcode 7.0.

With the NSLog line uncommented, the binary built with Xcode 7.0 hangs and the binary built with Xcode 6.4 does not.

I believe the issue happens because of lazy loading of entity data.

Has anybody else had a problem with the described case? I will be grateful for any help.

Acree answered 1/10, 2015 at 12:43 Comment(1)
I believe there is a problem or incompatibility in the recent iOS 9 SDKs, but to help people figure that out let's make your post clearer. First, could you upload your example to GitHub? (Chrome didn't allow me to download it.) Second, remove all changes you made in the pods. Remove unused code and make the example clear. The short story of the problem is: dispatching a block with a child context will deadlock on it if 1) it is dispatched from -viewWillAppear: and 2) a managed object was used (a fault was fired) before the dispatch (as you did in NSLog: d.name). The same problem with direct CoreDaHylton

This is why I don't use frameworks that abstract (i.e., hide) too many details of core data. It has very complex use patterns, and sometimes you need to know the details of how they interoperate.

First, I know nothing about magical record except that lots of people use it so it must be pretty good at what it does.

However, I immediately saw several completely wrong uses of core data concurrency in your examples, so I went and looked at the header files to see why your code made the assumptions that it does.

I don't mean to bash you at all, though this may seem like it at first blush. I want to help educate you (and I used this as an opportunity to take a peek at MR).

From a very quick look at MR, I'd say you have some misunderstandings of what MR does, and also core data's general concurrency rules.

First, you say this...

Two managed object contexts are created with concurrency type == NSPrivateQueueConcurrencyType (please check the code of MR_context of magical record framework). Both contexts has parent context with concurrency type = NSMainQueueConcurrencyType.

which does not appear to be true. The two new contexts are, indeed, private-queue contexts, but their parent (according to the code I glanced at on github) is the magical MR_rootSavingContext, which itself is also a private-queue context.

Let's break down your code example.

NSManagedObjectContext *childContext1 = [NSManagedObjectContext MR_context];
childContext1.name = @"childContext1";

NSManagedObjectContext *childContext2 = [NSManagedObjectContext MR_context];
childContext2.name = @"childContext2";

So, you now have two private-queue MOCs (childContext1 and childContext2), both children of another anonymous private-queue MOC (which we will call savingContext).

NSArray *results = [self entityWithId:entityId inContext: childContext2];

You then perform a fetch on childContext2. That code is actually...

-(NSArray *) entityWithId:(int)entityId
                inContext:(NSManagedObjectContext *)localContext 
{
   NSArray * results = [TestEntity MR_findByAttribute:@"id"
                                            withValue:[NSNumber numberWithInt:entityId]
                                            inContext:localContext];
  return results;
}

Now, we know that the localContext in this method is, in this case, another pointer to childContext2 which is a private-queue MOC. It is 100% against the concurrency rules to access a private-queue MOC outside of a call to performBlock. However, since you are using another API, and the method name offers no assistance to know how the MOC is being accessed, we need to go look at that API and see if it hides the performBlock to see if you are accessing it correctly.

Unfortunately, the documentation in the header file offers no indication, so we have to look at the implementation. That call ends up calling MR_executeFetchRequest... which does not indicate in the documentation how it handles the concurrency either. So, we go look at its implementation.

Now, we are getting somewhere. This function does try to safely access the MOC, but it uses performBlockAndWait which will block when it is called.

This is an extremely important piece of information, because calling this from the wrong place can indeed cause a deadlock. Thus, you must be keenly aware that performBlockAndWait is being called anytime you execute a fetch request. My own personal rule is to never use performBlockAndWait unless there is absolutely no other option.

However, this call here should be completely safe... assuming it is not being called from within the context of the parent MOC.

So, let's look at the next piece of code.

for(TestEntity *d in results)
{
    NSLog(@"e from fetchRequest %@ with name = '%@'", d,  d.name); /// this line is the reason of the hangup
}

Now, this is not the fault of MagicalRecord, because MR isn't even being used directly here. However, you have been trained to use those MR_ methods, which require no knowledge of the concurrency model, so you either forget or never learn the concurrency rules.

The objects in the results array are all managed objects that live in the childContext2 private-queue context. Thus, you may not ever access them without paying homage to the concurrency rules. This is a clear violation of the concurrency rules. While developing your application, you should enable concurrency debugging with the argument -com.apple.CoreData.ConcurrencyDebug 1.

This code snippet must be wrapped in either performBlock or performBlockAndWait. I hardly ever use performBlockAndWait for anything because it has so many drawbacks - deadlocks being one of them. In fact, just seeing the use of performBlockAndWait is a very strong indication that your deadlock is happening in there and not on the line of code that you indicate. However, in this case, it is at least as safe as the previous fetch, so let's make it a bit safer...

[childContext2 performBlockAndWait:^{
    for (TestEntity *d in results) {
        NSLog(@"e from fetchRequest %@ with name = '%@'", d,  d.name);
    }
}];
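For comparison, here is a minimal sketch of the asynchronous form of the same wrapping, using performBlock so the caller is never blocked. The function name LogNamesOnContextQueue is hypothetical, and the sketch assumes every object in results belongs to the context that is passed in and has a name attribute, as TestEntity does.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the test project or from MagicalRecord).
// Assumption: every object in `results` was fetched from `context`.
static void LogNamesOnContextQueue(NSArray *results, NSManagedObjectContext *context)
{
    // performBlock: enqueues the work on the context's own queue and returns
    // immediately, so the calling thread never waits on the context.
    [context performBlock:^{
        for (NSManagedObject *object in results) {
            // Firing the fault for `name` happens on the context's queue here,
            // which is what the concurrency rules require.
            NSLog(@"e from fetchRequest %@ with name = '%@'",
                  object, [object valueForKey:@"name"]);
        }
    }];
}
```

The trade-off is that the log output arrives later, after the caller has already moved on, so anything that depends on it has to be driven from inside the block.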

Next, you dispatch to the main thread. Is that because you just want something to occur on a subsequent event loop cycle, or is it because this code is already running on some other thread? Who knows. However, you have the same problem here (I reformatted your code for readability as a post).

dispatch_async(dispatch_get_main_queue(), ^{
    int entityId2 = 11;
    NSPredicate *predicate2 = [NSPredicate predicateWithFormat:@"id=%d", entityId2];
    NSArray *a = [TestEntity MR_findAllWithPredicate:predicate2
                                           inContext:childContext2];
    for (TestEntity *d in a) {
        NSLog(@"e from fetchRequest %@ with name = '%@'", d,  d.name);
    }
});

Now, we know that code starts out running on the main thread, and the search will call performBlockAndWait but your subsequent access in the for-loop again violates the core data concurrency rules.

Based on that, the only real problems I see are...

  1. MR seems to honor the core data concurrency rules within their API, but you must still follow the core data concurrency rules when accessing your managed objects.

  2. I really don't like the use of performBlockAndWait as it's just a problem waiting to happen.

Now, let's take a look at the screenshot of your hang. Hmmm... it's a classic deadlock, but it makes no sense because the deadlock happens between the main thread and the MOC thread. That can only happen if the main-queue MOC is a parent of this private-queue MOC, but the code shows that is not the case.

Hmmm... it didn't make sense, so I downloaded your project, and looked at the source code in the pod you uploaded. Now, that version of the code uses the MR_defaultContext as the parent of all MOCs created with MR_context. So, the default MOC is, indeed, a main-queue MOC, and now it all makes perfect sense.

You have a MOC as a child of a main-queue MOC. When you dispatch that block to the main queue, it is now running as a block on the main queue. The code then calls performBlockAndWait on a context that is a child of a MOC for that queue, which is a huge no-no, and you are almost guaranteed to get a deadlock.

So, it seems that MR has since changed their code from using a main-queue as the parent of new contexts to using a private-queue as the parent of new contexts (most likely due to this exact problem). So, if you upgrade to the latest version of MR you should be fine.

However, I would still warn you that if you want to use MR in multithreading ways, you must know exactly how they handle the concurrency rules, and you must also make sure you obey them anytime you are accessing any core-data objects that are not going through the MR API.

Finally, I'll just say that I've done tons and tons of core data stuff, and I've never used an API that tries to hide the concurrency issues from me. The reason is that there are too many little corner cases, and I'd rather just deal with them in a pragmatic way up front.

Finally, you should almost never use performBlockAndWait unless you know exactly why it's the only option. Having it be used as part of an API underneath you is even more scary... to me at least.

I hope this little jaunt has enlightened and helped you (and possibly some others). It certainly shed a little bit of light for me, and helped reestablish some of my previous unfounded skittishness.

Edit

This is in response to the "non-magical-record" example you provided.

The problem with this code is the exact same problem I described above, relative to what was happening with MR.

You have a private-queue context, as a child to a main-queue context.

You are running code on the main queue, and you call performBlockAndWait on the child context, which has to then lock its parent context as it tries to execute the fetch.

It is called a deadlock, but the more descriptive (and seductive) term is deadly embrace.

The original code is running on the main thread. It calls into a child context to do something, and it does nothing else until that child completes.

That child then, in order to complete, needs the main thread to do something. However, the main thread can't do anything until the child is done... but the child is waiting for the main thread to do something...

Neither one can make any headway.
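The shape of that embrace can be sketched with plain Core Data calls. This is only an illustration under stated assumptions, not the code from the test project: the function name and the coordinator and request parameters are hypothetical, and whether the hang actually shows up can depend on the SDK, as the question itself demonstrates.

```objc
#import <CoreData/CoreData.h>

// Hypothetical illustration of the risky pattern. Do not copy this into real code.
// Assumptions: `coordinator` is an already configured NSPersistentStoreCoordinator
// and `request` is a valid fetch request for an entity in its model.
static void RiskyFetchFromMainQueue(NSPersistentStoreCoordinator *coordinator,
                                    NSFetchRequest *request)
{
    // Parent context bound to the main queue.
    NSManagedObjectContext *mainContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
    mainContext.persistentStoreCoordinator = coordinator;

    // Private-queue child whose parent is the main-queue context.
    NSManagedObjectContext *childContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    childContext.parentContext = mainContext;

    dispatch_async(dispatch_get_main_queue(), ^{
        // The main queue is now occupied by this block and stays occupied
        // until performBlockAndWait: returns.
        [childContext performBlockAndWait:^{
            NSError *error = nil;
            // A child context fetches through its parent. If servicing that
            // request needs the parent's (main) queue, nothing can progress.
            NSArray *results = [childContext executeFetchRequest:request error:&error];
            NSLog(@"fetched %lu objects, error: %@",
                  (unsigned long)results.count, error);
        }];
    });
}
```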

The problem you are facing is very well documented, and in fact, has been mentioned a number of times in WWDC presentations and multiple pieces of documentation.

You should NEVER call performBlockAndWait on a child context.

The fact that you got away with it in the past is just a "happenstance" because it's not supposed to work that way at all.

In reality, you should hardly ever call performBlockAndWait.

You should really get used to doing asynchronous programming. Here is how I would recommend you rewrite this test, and whatever it is like that prompted this issue.

First, rewrite the fetch so it works asynchronously...

- (void)executeFetchRequest:(NSFetchRequest *)request
                  inContext:(NSManagedObjectContext *)context
                 completion:(void(^)(NSArray *results, NSError *error))completion
{
    [context performBlock:^{
        NSError *error = nil;
        NSArray *results = [context executeFetchRequest:request error:&error];
        if (completion) {
            completion(results, error);
        }
    }];
}

Then, you change your code that calls the fetch to do something like this...

NSFetchRequest *request = [[NSFetchRequest alloc] init];
[request setEntity: testEntityDescription ];
[request setPredicate: predicate2 ];
[self executeFetchRequest:request
                inContext:childContext2
               completion:^(NSArray *results, NSError *error) {
    if (results) {
        for (TestEntity *d in results) {
            NSLog(@"++++++++++ e from fetchRequest %@ with name = '%@'", d,  d.name);
        }
    } else {
        NSLog(@"Handle this error: %@", error);
    }
}];
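If the fetched data is eventually needed on the main queue (for example, to update the UI), one common approach is to hand over object IDs rather than the managed objects themselves, because managed objects must stay on their own context's queue. The following is a sketch under assumptions: the function name is hypothetical, mainContext is assumed to be a main-queue context that can see the same objects (for instance, the parent of childContext), and the entity is assumed to have a name attribute.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper, not part of the original answer.
// Assumptions: `childContext` is a private-queue context, `mainContext` is a
// main-queue context that can resolve the same object IDs.
static void FetchThenUseOnMainQueue(NSFetchRequest *request,
                                    NSManagedObjectContext *childContext,
                                    NSManagedObjectContext *mainContext)
{
    [childContext performBlock:^{
        NSError *error = nil;
        NSArray *results = [childContext executeFetchRequest:request error:&error];
        if (!results) {
            NSLog(@"Handle this error: %@", error);
            return;
        }

        // NSManagedObjectID values are safe to pass between queues;
        // the managed objects themselves are not.
        NSArray *objectIDs = [results valueForKey:@"objectID"];

        dispatch_async(dispatch_get_main_queue(), ^{
            // We are on the main queue, so touching mainContext directly is allowed.
            for (NSManagedObjectID *objectID in objectIDs) {
                NSError *lookupError = nil;
                NSManagedObject *object =
                    [mainContext existingObjectWithID:objectID error:&lookupError];
                if (object) {
                    NSLog(@"name on main queue = '%@'", [object valueForKey:@"name"]);
                } else {
                    NSLog(@"Could not resolve %@: %@", objectID, lookupError);
                }
            }
        });
    }];
}
```

Nothing in this flow waits synchronously on another queue, which is the property the asynchronous rewrite above is aiming for.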
Varix answered 27/10, 2015 at 20:17 Comment(5)
Here is the URL for the sample that reproduces the issue without MagicalRecord: dsr-company.com/… The provided project has the following problem: a deadlock on a fetch in performBlockAndWait called from the main thread. The issue is reproduced if the code is compiled with an Xcode version later than 6.4. It is not reproduced if the code is compiled with Xcode 6.4. Does anybody have any ideas?Acree
@KirillNeznamov - That code is doing the exact same thing I described that the version of MR you were using is doing. See the edit for more details.Varix
Thanks Jody, I will try to find the notes in the official apple documentation regarding the answer. Thank you very much!Acree
Excellent analysis! Regarding the two points in "the only real problems I see are..." I totally agree, but FWIW I would add another one: by default, a library should dispatch client-provided blocks (or closures) to private queues, and it should not set default values for queues or threads to the main thread either, unless explicitly set otherwise by the client.Mammy
@Mammy - I agree. Please feel free to edit the answer to add your thoughts in this regard.Varix

We switched over to Xcode 7 and I just ran into a similar deadlock issue with performBlockAndWait in code that works fine in Xcode 6.

The issue seems to be an upstream use of dispatch_async(mainQueue, ^{ ... to pass back the result from a network operation. That call was no longer needed after we added concurrency support for CoreData, but somehow it was left and never seemed to cause a problem until now.

It's possible that Apple changed something behind the scenes to make potential deadlocks more explicit.

Convocation answered 22/12, 2015 at 20:50 Comment(1)
I have reported the problem to Apple support, and they said that it is a bug. I have sent them the modified sources, which reproduce the problem without the MagicalRecord framework. So we should wait until Apple fixes it. BTW, they said that the issue may be resolved using the reset() method of NSManagedObjectContext.Acree
