CoreDNS has problems getting Endpoints, Services, Namespaces

I have the following problem with CoreDNS on the master node (note that the CoreDNS pod on the master is also not ready, 0/1):

E0321 22:54:45.590231       1 reflector.go:126] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.528164       1 reflector.go:126] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.531540       1 reflector.go:126] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.591304       1 reflector.go:126] pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused

Everything else seems to be running normally, and I can also access the internet from the nodes/pods in the cluster:

kube-system           coredns-776474d56-46fnz                        1/1     Running   0          2d23h   10.32.0.3       raspberrypi4-node     <none>           <none>
kube-system           coredns-776474d56-7nlw4                        0/1     Running   0          32h     10.36.0.1       raspberrypi4-master   <none>           <none>
kube-system           etcd-raspberrypi4-master                       1/1     Running   6          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-apiserver-raspberrypi4-master             1/1     Running   4          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-controller-manager-raspberrypi4-master    1/1     Running   9          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-proxy-6vgm9                               1/1     Running   0          3d13h   192.168.0.157   raspberrypi3-node     <none>           <none>
kube-system           kube-proxy-vqqv7                               1/1     Running   5          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-proxy-wj784                               1/1     Running   0          3d21h   192.168.0.90    raspberrypi4-node     <none>           <none>
kube-system           kube-scheduler-raspberrypi4-master             1/1     Running   9          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           weave-net-6db56                                2/2     Running   0          3d9h    192.168.0.90    raspberrypi4-node     <none>           <none>
kube-system           weave-net-7t7t6                                2/2     Running   0          3d9h    192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           weave-net-mg79s                                2/2     Running   0          3d9h    192.168.0.157   raspberrypi3-node     <none>           <none>

I have checked the docs and some ports are not open, but this is access to port 443, which is a privileged system port, so I am wondering whether I need to grant Kubernetes access to that port (and maybe forward it to 6443, which the docs list as the Kubernetes API server port). I will also want to reach this port from outside the cluster and would like Kubernetes services to handle that, so I would appreciate a simple command to forward ports 80 and 443 to it.
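For illustration only, a rough sketch of the kind of forwarding described above, using iptables DNAT rules; the master address 192.168.0.192 is taken from the pod listing above, and these rules are an assumption about the intent, not a fix for the connection refused error:

# Hypothetical sketch: forward incoming 443/80 on the node to the API server's 6443.
$ sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j DNAT --to-destination 192.168.0.192:6443
$ sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 192.168.0.192:6443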

I just noticed that the service is indeed listening on the correct IP/port, so I have no idea why it refuses the connection.

$ kubectl get svc -A
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  3d22h
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   3d22h
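
As an additional sanity check (not part of the original post), one could verify that the kubernetes Service maps to the real API server endpoint and compare access via the ClusterIP with direct access; the commands below are a hedged sketch using the addresses from this cluster:

$ kubectl get endpoints kubernetes
# should show the API server itself, e.g. 192.168.0.192:6443

$ curl -k https://192.168.0.192:6443/version   # directly to the API server
$ curl -k https://10.96.0.1:443/version        # via the ClusterIP (kube-proxy/iptables translation)
# If the direct call works but the ClusterIP call is refused, the problem lies in the
# Service-to-endpoint translation (kube-proxy / iptables), not in the API server.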
Emphasize answered 20/3, 2020 at 21:49 Comment(2)
add logs from coredns podHolbrook
The logs are at the top; they are all the same: dial tcp 10.96.0.1:443: connect: connection refusedEmphasize

The problem is with iptables.

  1. Make sure IP forwarding is enabled in the Linux kernel on every node. Execute the command: $ sudo sysctl -w net.ipv4.conf.all.forwarding=1

  2. If your Docker version is >= 1.13, Docker sets the default policy of the FORWARD chain to DROP, so you have to set it back to ACCEPT. Execute the command: $ sudo iptables -P FORWARD ACCEPT.

  3. Finally, pass the pod network CIDR to kube-proxy via the --cluster-cidr flag
    (a combined sketch of all three steps follows this list):
    --cluster-cidr=<pod network CIDR>

    The --cluster-cidr flag means:

    CIDR Range for Pods in cluster. Requires --allocate-node-cidrs to be true.
    If not provided, no off-cluster bridging will be performed.
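
A minimal shell sketch combining the three steps, assuming a kubeadm-style cluster; the sysctl.d file name and the 10.32.0.0/12 value (the Weave Net default) are assumptions, not values taken from the question:

# 1. Enable IP forwarding now and persist it across reboots (file name is an assumption).
$ sudo sysctl -w net.ipv4.conf.all.forwarding=1
$ echo 'net.ipv4.conf.all.forwarding = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf

# 2. Docker >= 1.13 sets the FORWARD chain policy to DROP; set it back to ACCEPT.
$ sudo iptables -P FORWARD ACCEPT

# 3. Check which cluster CIDR kube-proxy is using (kubeadm keeps it in the kube-proxy ConfigMap).
$ kubectl -n kube-system get configmap kube-proxy -o yaml | grep clusterCIDR
# If it is empty, set clusterCIDR to your pod network (e.g. 10.32.0.0/12 for Weave Net's default)
# and restart the kube-proxy pods so they pick it up.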
Similar problem: kubernetes-coredns-issue.

Please let me know if it helped.

Sweep answered 23/3, 2020 at 12:11 Comment(1)
Hey, what actually worked was the iptables trick (the 2nd step), but only in combination with sudo iptables --flush and sudo iptables -t nat --flush, with a restart of kubelet just before that (as suggested in that issue).Emphasize
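
For completeness, a sketch of the sequence this comment describes; it assumes kube-proxy and Weave Net will repopulate the rules afterwards, and flushing iptables is destructive to any custom rules, so treat it as an illustration rather than a recipe:

$ sudo systemctl restart kubelet
$ sudo iptables --flush            # flush the filter table rules
$ sudo iptables -t nat --flush     # flush the NAT table (Service/ClusterIP translation)
$ sudo iptables -P FORWARD ACCEPT  # the 2nd trick from the answer above
# kube-proxy and the CNI plugin (Weave Net here) recreate their rules on the next sync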

The accepted answer did not solve my problem. In case someone has similar issues, restarting CoreDNS solved mine:

kubectl rollout restart deployment coredns --namespace kube-system
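
If the kubectl in use is older than 1.15 and lacks rollout restart, deleting the CoreDNS pods should have the same effect, assuming the standard k8s-app=kube-dns label that kubeadm applies:

$ kubectl -n kube-system delete pod -l k8s-app=kube-dns
# The Deployment recreates the pods, which reconnect to the API server on startup.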
Tuneberg answered 17/12, 2020 at 19:51 Comment(2)
On AWS EKS running nodejs server and was getting getaddrinfo EAI_AGAIN errors due to some kind of kube system DNS timeouts. Restarting the coredns fixed my issue!Velvetvelveteen
In baremetal cluster, this didnt work either.Guadalquivir