Kubernetes CoreDNS in CrashLoopBackOff

I understand that this question has been asked dozens of times, but nothing I found online has helped.

My set up:

CentOS Linux release 7.5.1804 (Core)
Docker Version: 18.06.1-ce
Kubernetes: v1.12.3

Installed following the official guide and this one: https://www.techrepublic.com/article/how-to-install-a-kubernetes-cluster-on-centos-7/

CoreDNS pods are in Error/CrashLoopBackOff state.

kube-system   coredns-576cbf47c7-8phwt                 0/1     CrashLoopBackOff   8          31m
kube-system   coredns-576cbf47c7-rn2qc                 0/1     CrashLoopBackOff   8          31m

My /etc/resolv.conf:

nameserver 8.8.8.8

I also tried my local DNS resolver (the router):

nameserver 10.10.10.1

Setup and init:

kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

I tried to solve this by editing the CoreDNS ConfigMap (kubectl edit cm coredns -n kube-system) and changing

proxy . /etc/resolv.conf

directly to

proxy . 10.10.10.1

or

proxy . 8.8.8.8
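For reference, the Corefile shipped with this CoreDNS version looks roughly like the following (an approximation, not my exact file); the proxy line is the only part I changed:

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf   # <- replaced with 10.10.10.1 or 8.8.8.8
    cache 30
    loop
    reload
    loadbalance
}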

Also tried to:

kubectl -n kube-system get deployment coredns -o yaml | \
  sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
  kubectl apply -f -

And still nothing helps.

Error from the logs:

plugin/loop: Seen "HINFO IN 7847735572277573283.2952120668710018229." more than twice, loop detected

The other thread, coredns pods have CrashLoopBackOff or Error state, didn't help at all, because none of the solutions described there apply to my case. Nothing helped.

Rhoades answered 30/11, 2018 at 14:19 Comment(4)
Can you post the logs? You can get them through kubectl logs coredns-576cbf47c7-8phwt, for example. – Greenling
Updated, yep, sorry, I forgot to add the actual error. – Rhoades
Possible duplicate of coredns pods have CrashLoopBackOff or Error state. – Kotta
I didn't hit any of the presented solutions except a hacky one: removing the loop plugin. – Rhoades

I got the same error, and I managed to fix it with the steps below.

Note, however, that you are missing 8.8.4.4:

sudo nano /etc/resolv.conf

nameserver 8.8.8.8
nameserver 8.8.4.4

Run the following commands to reload the systemd daemon and restart the Docker service:

sudo systemctl daemon-reload

sudo systemctl restart docker

If you are using kubeadm, make sure you delete the entire cluster from the master and provision it again:

kubectl drain <node_name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node_name>
kubeadm reset
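Then provision again. As a sketch, reusing the init command and CNI manifest from the question (adjust the flags and manifest for your environment):

kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
# set up kubectl access for your user (kubeadm init prints these steps as well)
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# install a CNI plugin, e.g. flannel as in the question
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml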

Once you provision the new cluster, run:

kubectl get pods --all-namespaces

It should give the expected result below:

NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   calico-node-gldlr          2/2     Running   0          24s
kube-system   coredns-86c58d9df4-lpnj6   1/1     Running   0          40s
kube-system   coredns-86c58d9df4-xnb5r   1/1     Running   0          40s
kube-system   kube-proxy-kkb7b           1/1     Running   0          40s
kube-system   kube-scheduler-osboxes     1/1     Running   0          10s
Esteresterase answered 26/3, 2019 at 7:1 Comment(7)
I resolved it just with: nameserver 8.8.8.8, nameserver 8.8.4.4, then sudo systemctl daemon-reload and sudo systemctl restart docker. – Dicho
Hey @Altieres de Matos, I mentioned the same commands, man!! – Esteresterase
@NarendranathReddy Great! Worked for me. – Dewittdewlap
@SahilGulati Enjoy, buddy, happy Kubernetes!! – Esteresterase
For me, just sudo systemctl daemon-reload and sudo systemctl restart docker worked. Even though CoreDNS stays in loopback, one of the service IPs that wasn't reachable started responding again. – Alanson
Why do you have to restart the entire cluster just to get the coredns pods back up and running? – Shrew
I just found a working solution here: github.com/coredns/coredns/issues/2087#issuecomment-432387727, tried and tested. – Shrew

Run kubectl edit cm coredns -n kube-system, delete the loop line, save and exit, then restart the master node. That worked for me.
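A rough expansion of those steps (the k8s-app=kube-dns pod label is the kubeadm default; adjust if yours differs). Note the comment below: removing loop only disables loop detection, it does not remove the underlying loop.

kubectl -n kube-system edit cm coredns
# in the Corefile, delete the single line that reads:
#     loop
# then recreate the CoreDNS pods (or reboot the master node, as the answer says):
kubectl -n kube-system delete pod -l k8s-app=kube-dns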

Orion answered 2/12, 2018 at 18:9 Comment(2)
This means CoreDNS can't detect loops. I did this and it worked (CoreDNS could run and hostnames could resolve), but CoreDNS had very high CPU utilization. – Langrage
@Hk Can you please put your answer in an understandable format? – Gulfweed

I faced the same issue in my local Kubernetes-in-Docker (kind) setup: the CoreDNS pods were in CrashLoopBackOff.

Steps I followed to get the pods into a Running state:

As Tim Chan said in this post, and by referring to the GitHub issue linked there, I did the following:

  1. kubectl -n kube-system edit configmaps coredns -o yaml
  2. Modify the section forward . /etc/resolv.conf to forward . 172.16.232.1 (in my case I set 8.8.8.8 for the time being).
  3. Delete one of the CoreDNS pods, or wait a while; the pods will come back in a Running state (see the one-liner below).
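A concrete form of step 3, assuming the default k8s-app=kube-dns pod label:

kubectl -n kube-system delete pod -l k8s-app=kube-dns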
Fann answered 1/5, 2021 at 10:22 Comment(0)

This usually happens when CoreDNS can't talk to the kube-apiserver.

Check that your kubernetes service is in the default namespace:

$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        130d

Then (you might have to create a pod):

$ kubectl -n kube-system exec -it <any-pod-with-shell> sh
# ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes

Also, try hitting port 443 from the pod:

# telnet kubernetes.default.svc.cluster.local 443 # or
# curl kubernetes.default.svc.cluster.local:443
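If no existing pod has a shell, a throwaway one can be used instead (busybox:1.28 here is only an example image; newer busybox tags have a flaky nslookup):

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- sh
# then, inside the pod, check DNS and TCP reachability of the apiserver service
# (10.96.0.1 is the ClusterIP from the kubectl get svc output above):
nslookup kubernetes.default.svc.cluster.local
telnet 10.96.0.1 443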
Gonsalez answered 1/12, 2018 at 1:51 Comment(0)

The error I got was:

connect: no route to host","time":"2021-03-19T14:42:05Z"} crashloopbackoff

in the log shown by kubectl -n kube-system logs coredns-d9fdb9c9f-864rz.

The issue is mentioned in https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters

tl;dr: /etc/resolv.conf got updated somehow. The original one is at /run/systemd/resolve/resolv.conf, e.g.:

nameserver 172.16.232.1

Quick fix: edit the Corefile:

$ kubectl -n kube-system edit configmaps coredns -o yaml

and replace forward . /etc/resolv.conf with forward . 172.16.232.1, e.g.:

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . 172.16.232.1 {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2021-03-18T15:58:07Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "49996"
  uid: 428a03ff-82d0-4812-a3fa-e913c2911ebd

Done. After that, you may need to restart Docker:

sudo systemctl restart docker

Update: it could also be fixed by just running sudo systemctl restart docker.
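Since the Corefile above includes the reload plugin, CoreDNS normally picks up the ConfigMap change on its own after a short delay; if it doesn't, restarting just the CoreDNS deployment is a lighter alternative to restarting Docker (requires kubectl 1.15+):

kubectl -n kube-system rollout restart deployment coredns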

Dejong answered 19/3, 2021 at 15:58 Comment(0)
