We're running a web API app on Kubernetes (1.9.3) in AWS (set with KOPS). The app is a Deployment and represented by a Service (type: LoadBalancer) which is actually an ELB (v1) on AWS. This generally works - except that some packets (fragments of HTTP requests) are "delayed" somewhere between the client <-> app container. (In both HTTP and HTTPS which terminates on ELB).
From the node side:
( Note: Almost all packets on server-side arrive duplicated 3 times )
We use keep-alive so the tcp socket is open and requests arrive and return pretty fast. Then the problem happens:
- first, a packet with only the headers arrives [PSH,ACK] (I see the headers in the payload with tcpdump).
- an [ACK] is sent back by the container.
- The tcp socket/stream is quiet for a very long time (up to 30s and more - but the interval is not consistent, we consider >1s as a problem ).
- another [PSH, ACK] with the HTTP data arrives, and the request can finally be processed in the app.
From the client side:
I've run some traffic from my computer, recording it on the client side to see the other end of the problem, but not 100% sure it represents the real client side.
- a [PSH,ASK] with the headers go out.
- a couple of [ACK]s with parts of the payload start going out.
- no response arrives for a few seconds (or more) and no more packets go out.
- [ACK] marked as [TCP Window update] arrives.
- a short pause again and [ACK]s start arriving and the session continues until the end of the payload.
This is only happening under load.
To my understanding, this is somewhere between the ELB and the Kube-Proxy, but I'm clueless and desperate for help.
This is the arguments Kube-Proxy runs with:
Commands: /bin/sh -c mkfifo /tmp/pipe; (tee -a /var/log/kube-proxy.log < /tmp/pipe & ) ; exec /usr/local/bin/kube-proxy --cluster-cidr=100.96.0.0/11 --conntrack-max-per-core=131072 --hostname-override=ip-10-176-111-91.ec2.internal --kubeconfig=/var/lib/kube-proxy/kubeconfig --master=https://api.internal.prd.k8s.local --oom-score-adj=-998 --resource-container="" --v=2 > /tmp/pipe 2>&1
And we use Calico as a CNI:
So far I've tried:
- Using
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
- the issue remained. - (Playing around with ELB settings hoping something will do the trick ¯_(ツ)_/¯ )
- Looking for errors in the Kube-Proxy, found rare occurrences of the following:
E0801 04:10:57.269475 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:85: Failed to list *core.Endpoints: Get https://api.internal.prd.k8s.local/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp: lookup api.internal.prd.k8s.local on 10.176.0.2:53: no such host
...and...
E0801 04:09:48.075452 1 proxier.go:1667] Failed to execute iptables-restore: exit status 1 (iptables-restore: line 7 failed ) I0801 04:09:48.075496 1 proxier.go:1669] Closing local ports after iptables-restore failure
I couldn't find anything describing such issue and will appreciate any help. Ideas on how to continue and troubleshoot are welcome.
nodePort
to check is it problem with balancer or kube-proxy? – Minnesota