For several months, my iOS app has had intermittent SSL errors when making HTTPS requests to the backend.
The error description:
An SSL error has occurred and a secure connection to the server cannot be made.
The console logs when in debug mode:
2019-07-06 15:12:37.012198+0100 MyApp[37255:12499941] [BoringSSL] nw_protocol_boringssl_input_finished(1543) [C2.1:2][0x159e8e4a0] Peer disconnected during the middle of a handshake. Sending errSSLClosedNoNotify(-9816) alert
2019-07-06 15:12:37.026641+0100 MyApp[37255:12499941] TIC TCP Conn Failed [2:0x280486d00]: 3:-9816 Err(-9816)
2019-07-06 15:12:37.027759+0100 MyApp[37255:12499941] NSURLSession/NSURLConnection HTTP load failed (kCFStreamErrorDomainSSL, -9816)
2019-07-06 15:12:37.027839+0100 MyApp[37255:12499941] Task <D5AF17C0-C202-4229-BD52-690EFDB10379>.<1> HTTP load failed (error code: -1200 [3:-9816])
2019-07-06 15:12:37.028016+0100 MyApp[37255:12499941] Task <D5AF17C0-C202-4229-BD52-690EFDB10379>.<1> finished with error - code: -1200
2019-07-06 15:12:37.032759+0100 MyApp[37255:12500041] Task <D5AF17C0-C202-4229-BD52-690EFDB10379>.<1> load failed with error Error Domain=NSURLErrorDomain Code=-1200 "An SSL error has occurred and a secure connection to the server cannot be made." UserInfo={NSErrorFailingURLStringKey=https://api.example.com/v1/example/example?param=example, NSLocalizedRecoverySuggestion=Would you like to connect to the server anyway?, _kCFStreamErrorDomainKey=3, _NSURLErrorFailingURLSessionTaskErrorKey=LocalDataTask <D5AF17C0-C202-4229-BD52-690EFDB10379>.<1>, _NSURLErrorRelatedURLSessionTaskErrorKey=(
"LocalDataTask <D5AF17C0-C202-4229-BD52-690EFDB10379>.<1>"
), NSLocalizedDescription=An SSL error has occurred and a secure connection to the server cannot be made., NSErrorFailingURLKey=https://api.example.com/v1/example/example?param=example, NSUnderlyingError=0x283ff2160 {Error Domain=kCFErrorDomainCFNetwork Code=-1200 "(null)" UserInfo={_kCFStreamPropertySSLClientCertificateState=0, _kCFNetworkCFStreamSSLErrorOriginalValue=-9816, _kCFStreamErrorDomainKey=3, _kCFStreamErrorCodeKey=-9816}}, _kCFStreamErrorCodeKey=-9816} [-1200]
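For anyone wanting to track this specific failure in analytics, here is a minimal sketch of how the low-level code (-9816 in the log above) might be pulled out of the error. The function names are hypothetical, and `_kCFStreamErrorCodeKey` is simply the private, undocumented key copied from the log, so treat the lookup as best-effort rather than a supported API.

```swift
import Foundation

/// Hypothetical helper: returns the low-level stream error code (e.g. -9816)
/// carried by a URL loading error, or nil if none is present.
func sslStreamErrorCode(from error: Error) -> Int? {
    let nsError = error as NSError
    // Private key as it appears in the log; not a documented constant.
    let key = "_kCFStreamErrorCodeKey"
    // The code appears at the top level of userInfo...
    if let code = nsError.userInfo[key] as? Int {
        return code
    }
    // ...and again inside the underlying CFNetwork error.
    if let underlying = nsError.userInfo[NSUnderlyingErrorKey] as? NSError {
        return underlying.userInfo[key] as? Int
    }
    return nil
}

/// Hypothetical helper: true for NSURLErrorDomain code -1200,
/// the "An SSL error has occurred..." failure shown above.
func isSecureConnectionFailure(_ error: Error) -> Bool {
    let nsError = error as NSError
    return nsError.domain == NSURLErrorDomain
        && nsError.code == NSURLErrorSecureConnectionFailed
}
```

Logging both values alongside the network type (wifi vs. cellular) would make it easier to confirm whether a change on the server side actually reduces the failures.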
The error occurs mainly on 3G/4G, not wifi, and occurs more often when the network signal is low. If it happens once it will keep happening for the next few requests, but will eventually work again shortly thereafter.
Based on the analytics, user reviews, and user bug reports, it is affecting a large percentage of users, but not 100% of them.
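Since a failed request tends to succeed again shortly afterwards, one client-side mitigation would be to retry only this error with a short backoff. The following is a rough sketch under that assumption, not a fix for the root cause. The names `shouldRetry`, `retryDelay`, and `fetch`, the limit of 3 attempts, and the 0.5 s base delay are all hypothetical choices.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // only needed when building off Apple platforms
#endif

/// Hypothetical policy: retry only NSURLErrorDomain -1200,
/// and only while fewer than `maxAttempts` tries have been made.
func shouldRetry(_ error: Error, attempt: Int, maxAttempts: Int = 3) -> Bool {
    let nsError = error as NSError
    return attempt < maxAttempts
        && nsError.domain == NSURLErrorDomain
        && nsError.code == NSURLErrorSecureConnectionFailed
}

/// Exponential backoff: base, 2 * base, 4 * base, ... for attempt 1, 2, 3, ...
func retryDelay(forAttempt attempt: Int, base: TimeInterval = 0.5) -> TimeInterval {
    return base * pow(2.0, Double(attempt - 1))
}

/// Loads `url`, retrying SSL failures according to the policy above.
func fetch(_ url: URL,
           session: URLSession = .shared,
           attempt: Int = 1,
           completion: @escaping (Result<Data, Error>) -> Void) {
    session.dataTask(with: url) { data, _, error in
        if let error = error {
            if shouldRetry(error, attempt: attempt) {
                // Wait before trying again so the connection has time to recover.
                let delay = retryDelay(forAttempt: attempt)
                DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
                    fetch(url, session: session, attempt: attempt + 1, completion: completion)
                }
            } else {
                completion(.failure(error))
            }
            return
        }
        completion(.success(data ?? Data()))
    }.resume()
}
```

This only hides the symptom from users; it would also be worth counting how many requests needed a retry, so the underlying error rate stays visible.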
-
The backend is hosted on AWS Elastic Beanstalk. It is served as a Docker app, using an Nginx proxy server, with multiple instances behind a load balancer.
I've tried increasing and decreasing the instance sizes and it seemed to make no difference.
I recently made an entirely new Elastic Beanstalk environment from scratch, to see if that helped. Previously it was using the Classic Load Balancer, now it is using the Application Load Balancer. Early indications are it has reduced the number of SSL errors, but they are still occurring.
The new load balancer is using this SSL policy:
ELBSecurityPolicy-FS-2018-06
Which is defined here: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-https-listener.html
Should it be using a different SSL policy?
-
In the app the web requests were being made using URLSession.shared.dataTask... etc. I've also tried using the Alamofire library to see if that made a difference. It did not.
I feel like this may have something to do with Apple's App Transport Security. However, as it fails intermittently I'm at a loss as to how.
The relevant Apple docs are at the bottom of this page: https://developer.apple.com/security/
If you need more information to help debug please let me know.
-
UPDATE:
So after trying many of the suggestions (thank you to everyone who contributed!) - and learning a lot more about SSL, load balancers, etc. - I have found something that has fixed the issue.
(Minor caveat: I can't be 100% certain it's completely fixed, due to the intermittent nature of the issue and my not so great tracking of it, but all available evidence suggests it is now fixed.)
The "fix" was to move the service to Google Cloud Run, which is basically serverless for Docker containers.
Crucially, Google Cloud automatically handles setting up the SSL certificate, so there were zero parts for me to screw up. Another advantage is that I'm now only paying for the compute time I'm actually using, so it's cheaper.
Apologies to anyone reading this looking for an actual solution to the original problem, but there are a bunch of good things to investigate in the answers and comments below.
NSURLAuthentificationChallenge? #19507707 – Brice