Intermittent & Temporary iOS Keychain Failure
Asked Answered
V

2

23

We have an app that heavily relies on being able to access a user's session token using iOS's Keychain. When our app opens, the first thing that's checked is whether the token is available - and if not, we show the user a login screen. We do not use any 3rd party library for this, and use Keychain's SecItemAdd() / SecItemCopyMatching() directly with the following options:

  • kSecClassGenericPassword
  • kSecAttrAccessibleAlwaysThisDeviceOnly

We see little to no issues with this during normal usage.

The Problem

We've had users reporting that upon opening their app, they're shown the login screen (suggesting Keychain could not find a value), when they were in fact logged in. We found that in this instance, we found that upon killing and relaunching the app, users were back to normal (session was found in Keychain). Once we found this, we tried added an exponential backoff to keep querying Keychain, thinking that it may had only been unavailable for the first few seconds. This did not work, and proved to us that Keychain seems to be unavailable for the entire app launch session. We are not able to reproduce this issue. It seems to happen very intermittently.

Upon further investigation, we found that commonly, users had received VoIP Pushes prior to having this issue. We're still not able to reproduce this issue reliably, but upon debugging, found that our Keychain.session was found to be nil when receiving these pushes (we rely on it there as well), which is before the user would open their app to see that they are pseudo- "logged out."

We use PushKit and PKPushRegistry in order to do VoIP Push. This requires our app to have "Background Modes: Voice over IP" enabled. We use all of this as suggested by Apple's documentation. Here's a sample:

func pushRegistry(_ registry: PKPushRegistry, didReceiveIncomingPushWith payload: PKPushPayload, for type: PKPushType) {
    guard let _ = Keychain.session else {
        print("Session token not available")
        return
    }
    handle(notification: payload.dictionaryPayload)
}

This code relies on Keychain.session being immediately available upon receiving a VoIP Push. As mentioned before, we've seen instances where it is not available, and so this function logs out and simply returns.

I've also found this relevant Apple forum post, suggesting that this could be an iOS bug.

Can anyone help us with this issue? Any help would be greatly appreciated. Even if this is in fact an iOS bug, we're open to workarounds.

Vacuva answered 13/8, 2018 at 21:30 Comment(4)
There was an occasion when every application on my phone had its keychain data deleted. This happened completely randomly, on a non-jailbroken phone, on the latest version of iOS (it was iOS 11.1 or 11.2). This wasn't a separated case either: it happened with my colleague as well. I don't think that there is a secure/easy solution for this, it's up to Apple to fix this awful issue.Nellanellda
@TamásSengel That's unfortunate, though I believe that our issues are different - in my case, Keychain data was not deleted, but rather unavailable for that app session. If the user killed their app and reopened, it was just fine.Vacuva
Did you find a fix @GPRyan? I am having the exact same issue.Staffordshire
@jesper87 Unfortunately, no :( However, we have heard a smaller percentage of reports of this issue, so it's possible that an iOS update has alleviated the issue.Vacuva
L
10

I think this topic is driving the iOS dev community literally crazy since long time. I've read so many articles and thread about this that I lost the count.

So, I'm dealing with this issue too, and based on the @naqi answer, I started moving in that direction and I might have found a working solution.

Assumptions:

The issue seems to happen when the app is getting suspended, due any sort of reason (eg. the app in background from long time and due memory pressure the OS send it in suspended state, or the app is in bg and user upgrades to a new version from the store or automatic app update does it). I'm no longer fully sure what suspended means, but I strongly suspect that the app may get resumed/awaken without actually getting presented on the screen (due bg task, but not necessarily). One Apple stuff member on the Apple developer forum, had the brave to reply saying something like "try to put a delay between addDidFinishLaunchingWithOptions and appDidBecomeActive". Apart from the madness here, reading that comment made me consider that maybe when the app gets resumed it triggers the addDidFinishLaunchingWithOptions but of course not the appDidBecomeActive (as it may not be resumed by the user but from any sort of event handled by the OS in background, for which we may not have the control seems, or any other feature like bg task or silent push notifications etc).

Extra info:

According with a quite old penetration test article I've found on the web (https://resources.infosecinstitute.com/iphone-penetration-testing-3/) querying the keychain actually would address the query to another process called securityd running on the device, which magically schedule the queries and performs them to the keychain db and, finally returns back the content to the requesting app, assuming that the app is still alive when the query gets completed on the keychain db. This made me think that, if that is still true, such process may fall in sort of "DDoS" attack where your app (or maybe all the apps in execution? Not clear how it actually works) performs too many queries. This may lead to false positive errors like secItemNotFound or similar. Again, as nobody confirmed this, keep it as my personal consideration.

Approach:

From the @Naqi answer, I could exclude the points 2,3 and 4, keeping the point 1 as only still valid reason. Said that, I decided to move any CRUD operation dealing with the keychain, from the addDidFinishLaunchingWithOptions to appDidBecomeActive, by handling the app setup instructions (whatever needed by your app) in there, in order to make sure that eventual background events of suspending/resuming actions won't affect the keychain at all as the appDidBecomeActive would not get called unless user actually makes the app to get opened and shown on the screen. Of course this should happen only on app launch, so you can use a bool flag or similar semaphore approach in appDidBecomeActive to make sure your setup logic is executed only once.

Combined with the above approach, In the App Delegate I also explicitly implemented application(shouldSaveApplicationState:) by returning false and application(shouldRestoreApplicationState: still returning false (Even though I assume by default should be already like this). This may not necessarily be useful but it was worth it to try too.

Summarising, on some affected devices which were randomly failing the keychain access, seems now not to happen anymore since weeks and, consider this was happening for some, almost every 2/3 days let's say, I assume it might have attenuated the issue at least.

I can say that the monitored devices were all in different configurations, with a lot of apps installed or not, with storage memory almost full or not, with a lot of apps in bg or none at all, any form factor and different iOS versions. Keychain backup to cloud not enabled, and items attribute on the keychain at least as kSecAttrAccessibleWhenUnlockedThisDeviceOnly

Open point not fully verified: Supposedly the access to keychain should be thread safe, but I found different thread on the web contradicting each other. Maybe that could somehow impact your logic and cause unexpected state

(Congrats if you managed to read up to here :) )

Hope this helps to further understand what's going on with this keychain tragedy in iOS

Loosing answered 7/8, 2019 at 18:53 Comment(0)
C
4

I was having similar issues and after talking to engineers at WWDC we identified few issues that is leading to this behavior.

  1. We were calling Keychain too often and this can lead to a situation where it gets expensive and some calls would not finish or timeout

  2. If an app does not provide Access Group for keychain or does not add it in save transaction with keychain then Xcode generates one but this is dynamic and can change between dev environments

  3. If you provided an access group later in the life of your app, then your old keychain value will remain associated with the app and a new value will be created with the access group you will now provided
  4. This also means that when you query keychain you have to provide an access group otherwise you will get a reference to what ever was found first.

Edit: formatting

Cheesecake answered 6/6, 2019 at 23:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.