core_dns stuck in ContainerCreating status [closed]

I am trying to set up a basic k8s cluster.

After running kubeadm init --pod-network-cidr=10.244.0.0/16, the coredns pods are stuck in ContainerCreating status:

NAME                            READY   STATUS              RESTARTS   AGE
coredns-6955765f44-2cnhj        0/1     ContainerCreating   0          43h
coredns-6955765f44-dnphb        0/1     ContainerCreating   0          43h
etcd-perf1                      1/1     Running             0          43h
kube-apiserver-perf1            1/1     Running             0          43h
kube-controller-manager-perf1   1/1     Running             0          43h
kube-flannel-ds-amd64-smpbk     1/1     Running             0          43h
kube-proxy-6zgvn                1/1     Running             0          43h
kube-scheduler-perf1            1/1     Running             0          43h

OS-IMAGE: Ubuntu 16.04.6 LTS KERNEL-VERSION: 4.4.0-142-generic CONTAINER-RUNTIME: docker://19.3.5

Errors from the journalctl -xeu kubelet command:

Jan 02 10:31:44 perf1 kubelet[11901]: 2020-01-02 10:31:44.112 [INFO][10207] k8s.go 228: Using Calico IPAM
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118281   11901 cni.go:385] Error deleting kube-system_coredns-6955765f44-2cnhj/12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf from
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118828   11901 remote_runtime.go:128] StopPodSandbox "12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf" from runtime service failed:
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118872   11901 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf"}
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118917   11901 kuberuntime_manager.go:676] killPodWithSyncResult failed: failed to "KillPodSandbox" for "e44bc42f-0b8d-40ad-82a9-334a1b1c8e40" with
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118939   11901 pod_workers.go:191] Error syncing pod e44bc42f-0b8d-40ad-82a9-334a1b1c8e40 ("coredns-6955765f44-2cnhj_kube-system(e44bc42f-0b8d-40ad-
Jan 02 10:31:47 perf1 kubelet[11901]: W0102 10:31:47.081709   11901 cni.go:331] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "747c3cc9455a7d
Jan 02 10:31:47 perf1 kubelet[11901]: 2020-01-02 10:31:47.113 [INFO][10267] k8s.go 228: Using Calico IPAM
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.118526   11901 cni.go:385] Error deleting kube-system_coredns-6955765f44-dnphb/747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53 from
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119017   11901 remote_runtime.go:128] StopPodSandbox "747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53" from runtime service failed:
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119052   11901 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53"}
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119098   11901 kuberuntime_manager.go:676] killPodWithSyncResult failed: failed to "KillPodSandbox" for "52ffb25e-06c7-4cc6-be70-540049a6be20" with
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119119   11901 pod_workers.go:191] Error syncing pod 52ffb25e-06c7-4cc6-be70-540049a6be20 ("coredns-6955765f44-dnphb_kube-system(52ffb25e-06c7-4cc6-

I have tried kubeadm reset as well, but no luck so far.

Scrivener answered 2/1, 2020 at 5:22 Comment(1)
Have you deployed a CNI plugin such as Calico or Weave? – Neighborly

Looks like the issue was caused by my switching from the Calico CNI to flannel. Following the steps mentioned here resolved the issue for me:

Pods failed to start after switch cni plugin from flannel to calico and then flannel

Additionally, you may have to clear the contents of /etc/cni/net.d.
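In case it helps, here is a minimal sketch of that cleanup as a shell function. The function name and the backup scheme are my own invention; on the node you would point it at /etc/cni/net.d (as root) and restart the kubelet afterwards so it re-reads the directory.

```shell
#!/bin/sh
# Hypothetical helper: move leftover CNI config files (e.g. calico's)
# into a backup directory instead of deleting them outright.
clear_cni_config() {
  dir="$1"
  backup="${dir}.bak"
  mkdir -p "$backup"
  # Only top-level files matter: CNI reads *.conf / *.conflist from this dir.
  find "$dir" -maxdepth 1 -type f -exec mv {} "$backup"/ \;
}

# On the node, something like:
#   clear_cni_config /etc/cni/net.d
#   systemctl restart kubelet
```

Keeping a backup instead of `rm -rf` makes it easy to compare the old calico config against the flannel one the DaemonSet writes back.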

Scrivener answered 2/1, 2020 at 6:8 Comment(1)
God bless you, you're my life saver – Turntable

CoreDNS will not start up before a CNI network is installed.

For flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.

  • Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by running sysctl net.bridge.bridge-nf-call-iptables=1, so that bridged IPv4 traffic is passed to iptables’ chains. This is a requirement for some CNI plugins to work.
  • Make sure that your firewall rules allow traffic on UDP ports 8285 and 8472 for all hosts participating in the overlay network (see here).
  • Note that flannel works on amd64, arm, arm64, ppc64le and s390x under Linux. Windows (amd64) is claimed as supported in v0.11.0, but the usage is undocumented.
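A quick way to verify the sysctl requirement before deploying flannel is a small check script; the helper name is mine, the procfs path is the standard one the sysctl maps to.

```shell
#!/bin/sh
# Report whether the bridge-netfilter sysctl that flannel relies on is enabled.
check_bridge_nf() {
  f=/proc/sys/net/bridge/bridge-nf-call-iptables
  if [ -r "$f" ] && [ "$(cat "$f")" = "1" ]; then
    echo "ok"
  else
    # Fix on the node with:
    #   sudo modprobe br_netfilter
    #   sudo sysctl net.bridge.bridge-nf-call-iptables=1
    echo "bridge-nf-call-iptables is not 1"
  fi
}
check_bridge_nf
```

If the file is missing entirely, the br_netfilter kernel module is not loaded yet, which the modprobe line above addresses.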

To deploy flannel as the CNI network:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml

After you have deployed flannel, delete the CoreDNS pods; Kubernetes will recreate them.
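A concrete form of that last step, assuming the default kubeadm labelling (CoreDNS pods carry the k8s-app=kube-dns label), might look like the following; it needs a working kubeconfig on the control-plane node, so treat it as a sketch rather than something runnable off-cluster.

```shell
# Delete the stuck CoreDNS pods; the Deployment controller
# recreates them, this time with a working CNI in place.
kubectl -n kube-system delete pod -l k8s-app=kube-dns

# Watch the replacements come up Running.
kubectl -n kube-system get pods -l k8s-app=kube-dns
```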

Neighborly answered 2/1, 2020 at 5:40 Comment(0)

You have deployed flannel as the CNI, but the kubelet logs show that Kubernetes is using Calico:

[INFO][10207] k8s.go 228: Using Calico IPAM

Something is wrong with the container network; without it, CoreDNS cannot start. You might have to reinstall with the correct CNI. Once the CNI is deployed successfully, CoreDNS gets deployed automatically.

Joceline answered 2/1, 2020 at 6:6 Comment(1)
Yes, I tried to install calico first and then switched to flannel. I always suspected that calico was not fully removed. I have posted the answer below now – Scrivener

So here is my solution:

  • First, CoreDNS will run on your master/control-plane nodes
  • Now run ifconfig and check these two interfaces: cni0 and flannel.1
  • Suppose cni0=10.244.1.1 and flannel.1=10.244.0.0; then your DNS pods will not be created
  • It should be cni0=10.244.0.1 and flannel.1=10.244.0.0, which means cni0 must fall within flannel.1's /24

Run the following two commands on your master/control-plane machines to bring the interface down and remove it:

sudo ifconfig cni0 down;
sudo ip link delete cni0;

Now check via ifconfig: you will see two more vethxxxxxxxx interfaces appear. This should fix your problem.
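The subnet comparison described above can be sketched as a tiny shell helper. The function name is mine, and it only applies to the default /24 per-node subnets flannel hands out; on a real node you would feed it the addresses read from the interfaces.

```shell
#!/bin/sh
# True when two IPv4 addresses share the same first three octets,
# i.e. sit in the same /24 - the condition described above
# between cni0 and flannel.1.
same_24() {
  [ "$(echo "$1" | cut -d. -f1-3)" = "$(echo "$2" | cut -d. -f1-3)" ]
}

# On the node you would read the real addresses, e.g.:
#   ip -4 -o addr show cni0
#   ip -4 -o addr show flannel.1
if same_24 10.244.1.1 10.244.0.0; then
  echo "cni0 matches flannel.1/24"
else
  echo "mismatch: bring cni0 down and delete it"  # the broken case above
fi
```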

Cappadocia answered 26/6, 2022 at 4:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.