iOS app -- no cellular access to our domain on some devices

With a React Native app (only tested those generated with create-react-app), some iPhone users are experiencing an issue where the app can almost never make web requests to our API when connected via cellular data. The domain that is having issues points to an Amazon Elastic Load Balancer (Layer 7, SSL termination), which points to an Nginx reverse proxy (inside an EKS Kubernetes cluster).

Other APIs (e.g. Mapbox) called by the app work fine over cellular data, including one of ours hosted on a dedicated server. The only requests that don't work are those on our ELB domain. When the user switches to WiFi, our app is able to make web requests to that domain.

This has been observed on iPhone 7, iPhone 8, and iPhone X, all running iOS 12.3.1. One device is on Verizon and the other 5 reported are on AT&T. Every API call is HTTPS. Deleting and reinstalling the app and restarting the device do not resolve the issue. We confirmed in all cases that cellular data was enabled for the app in Settings > Cellular > [App name] and in Settings > [App name] > Use Cellular Data.

The app is built with React Native and web requests are performed with the cross-fetch library.

We were able to get a device that has the issue and run it through Xcode. Here is a subset of the error stack captured in Xcode:

nw_connection_copy_connected_local_endpoint [C12] Connection has no local endpoint
2019-06-27 11:26:16.841347-0400 myapp[23700:1527268] [BoringSSL] nw_protocol_boringssl_get_output_frames(1301) [C10.1:2][0x117d5a050] get output frames failed, state 8196
2019-06-27 11:26:22.465855-0400 myapp[23700:1527305] [BoringSSL] nw_protocol_boringssl_error(1584) [C20.1:2][0x119b0e420] Lower protocol stack error: 54
2019-06-27 11:26:22.466665-0400 myapp[23700:1527305] TIC TCP Conn Failed [20:0x280022400]: 1:54 Err(54)
2019-06-27 11:26:23.040101-0400 myapp[23700:1527399] Task <DD5FDD4A-1BE0-41ED-AAC4-9EB07F61F109>.<7> HTTP load failed (error code: -1005 [1:54])
2019-06-27 11:26:23.040408-0400 myapp[23700:1527305] Task <DD5FDD4A-1BE0-41ED-AAC4-9EB07F61F109>.<7> finished with error - code: -1005
load failed with error Error Domain=NSURLErrorDomain Code=-1005 "The network connection was lost." UserInfo={_kCFStreamErrorCodeKey=54, NSUnderlyingError=0x283a521f0 {Error Domain=kCFErrorDomainCFNetwork Code=-1005 "(null)" UserInfo={NSErrorPeerAddressKey=<CFData 0x28161ab70 [0x1e9e5d420]>{length = 16, capacity = 16, bytes = 0x100201bb3416ca8a0000000000000000}, _kCFStreamErrorCodeKey=54, _kCFStreamErrorDomainKey=1}}, _NSURLErrorFailingURLSessionTaskErrorKey=LocalDataTask <DD5FDD4A-1BE0-41ED-AAC4-9EB07F61F109>.<7>, _NSURLErrorRelatedURLSessionTaskErrorKey=(
    "LocalDataTask <DD5FDD4A-1BE0-41ED-AAC4-9EB07F61F109>.<7>"
), NSLocalizedDescription=The network connection was lost.
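For anyone reading the log: `_kCFStreamErrorDomainKey=1` is the POSIX error domain, and POSIX error 54 on Darwin is ECONNRESET, i.e. the connection was reset by the peer or something in between. The `NSErrorPeerAddressKey` bytes look like a raw IPv4 `sockaddr_in`, so they can be decoded to see which address and port the reset connection was talking to. Below is a minimal TypeScript sketch of that decoding; the function name `decodeSockaddrIn` is hypothetical, and it assumes the Darwin layout (length byte, family byte, big-endian port, 4 address bytes).

```typescript
// Hypothetical helper (not part of any library): decodes an IPv4 sockaddr_in
// that was dumped as a hex string, such as the NSErrorPeerAddressKey bytes.
// Assumed Darwin layout: sin_len (1 byte), sin_family (1 byte),
// sin_port (2 bytes, big-endian), sin_addr (4 bytes), then zero padding.
function decodeSockaddrIn(hex: string): { port: number; address: string } {
  const clean = hex.replace(/^0x/i, "");
  const bytes: number[] = [];
  for (let i = 0; i < clean.length; i += 2) {
    bytes.push(parseInt(clean.slice(i, i + 2), 16));
  }
  if (bytes.length < 8) {
    throw new Error("too short to be a sockaddr_in");
  }
  // AF_INET is 2; anything else (e.g. IPv6) is out of scope for this sketch.
  if (bytes[1] !== 2) {
    throw new Error("not an AF_INET address, family=" + bytes[1]);
  }
  const port = (bytes[2] << 8) | bytes[3];
  const address = bytes.slice(4, 8).join(".");
  return { port, address };
}

// The bytes from the log above.
const peer = decodeSockaddrIn("0x100201bb3416ca8a0000000000000000");
console.log(peer.address + ":" + peer.port);
```

For the bytes in this log the port decodes to 443, so the reset happened on the HTTPS connection itself. The decoded address can be compared with what the ELB hostname resolves to.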

Queries to this particular [ELB] -> [Nginx container] -> [Service containers] setup will occasionally work but then stop. This almost suggests a keep-alive problem like this issue. We had the ELB idle timeout set at its default (60s) and increased it to 300s with no apparent effect. We also tried setting the Nginx keep-alive timeout to 360s and to 0s (disabled completely).
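A client-side mitigation sometimes suggested for -1005 errors caused by stale keep-alive connections is to retry idempotent requests. The following is only a sketch under assumptions: the helper name `retryOnNetworkError` and its defaults are hypothetical, and it assumes the fetch implementation rejects with a `TypeError` on transport failures (as fetch polyfills typically do). Whether it helps with the persistent failures described here is untested.

```typescript
// Hypothetical helper: re-runs an idempotent request when the transport fails.
// Assumption: transport-level failures surface as a TypeError (for example
// "Network request failed"); any other error is rethrown immediately.
async function retryOnNetworkError<T>(
  attempt: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 200,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (!(err instanceof TypeError)) {
        throw err; // not a transport failure, do not retry
      }
      lastError = err;
      if (i < maxAttempts - 1) {
        // Exponential backoff: baseDelayMs, 2x, 4x, ...
        const delay = baseDelayMs * Math.pow(2, i);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Usage would look like `retryOnNetworkError(() => fetch(url))`, where `url` is a GET endpoint. Only wrap requests that are safe to repeat, since a request may have reached the server before the connection was reset.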

For the domain in question we have a mix of services hosted in the Kubernetes cluster, such as Java and Node.js. The issue affects all of them equally.

None of the Android app users have reported this issue.

The devices that experience this issue all do so consistently; it is not intermittent.

Due to the type of error, the requests never reach our Nginx logs.

Elinorelinore answered 21/5, 2019 at 17:54 Comment(7)
What kind of requests are failing? Is it possible some protocols are being blocked by their ISP? Content blocker extensions come to mind as a possible culprit, but that should apply to WiFi as well..Xenos
The requests are HTTPS, to our server. We have not had any issues with the Android devices at all (running the same React Native JS code)..Elinorelinore
Are there any common denominators in what cellular provider they're using? Also, maybe ask on the React Native forums, it might be the javascript doing its thing and making weird untraceable bugs.Xenos
And what error do the requests fail with?Xenos
Can you provide network logs?Gunplay
We were finally able to capture the error log, see edits above.Elinorelinore
This <developer.apple.com/library/archive/qa/qa1941/_index.html> or this <github.com/AFNetworking/AFNetworking/issues/2801> might help. Seems like it could be something with keep alive and your server.Commend

Unfortunately, we never found a clear answer to the problem, but we did implement a workaround.

Certain iOS 12.3.1 iPhones on cellular networks seem to have an issue with the fact that Amazon's ELB Classic always sends a "Connection: keep-alive" response header. You can change the load balancer's idle timeout, but you cannot set it to 0 (the minimum is 1 second). We can reproduce the iOS connection errors with a new app generated by create-react-app. The requests always work at first and then start to fail consistently.

We fixed the problem by switching from ELB to a Network Load Balancer (AWS NLB). The NLB talks directly to an Nginx ingress controller. Since it's at the TCP level, the NLB layer does not change the headers. The default Nginx controller does not send a "Connection" response header at all. Using this new setup, the iOS app works just fine on all devices.

Elinorelinore answered 3/7, 2019 at 19:31 Comment(0)
