NetworkPlugin cni failed to set up pod "xxxxx" network: failed to set bridge addr: "cni0" already has an IP address different from 10.x.x.x - Error
I get this error after I start the worker node VMs (Kubernetes) from the AWS console. I am using PKS (Pivotal Container Service).

network for pod "xxxxx": NetworkPlugin cni failed to set up pod "xxxxx" network: failed to set bridge addr: "cni0" already has an IP address different from 10.x.x.x/xx

I suppose that Flannel assigns a subnet lease to the workers in the cluster which expires after 24 hours; after that, the flannel.1 and cni0 /24 subnets no longer match, which causes this issue.
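For reference, a quick way to check whether a worker has drifted into this state (the subnet.env path below is an assumption based on a standard Flannel setup; on PKS workers the file may live elsewhere):

# Subnet that flannel currently holds a lease for on this node
grep FLANNEL_SUBNET /run/flannel/subnet.env
# Address actually configured on the cni0 bridge
ip -4 addr show cni0 | grep inet
# If the two /24s differ, the node has hit the mismatch described above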

I also know a workaround:

bosh ssh -d worker -c "sudo /var/vcap/bosh/bin/monit stop flanneld" 
bosh ssh -d worker -c "sudo rm /var/vcap/store/docker/docker/network/files/local-kv.db" 
bosh ssh -d worker -c "sudo /var/vcap/bosh/bin/monit restart all"

However, is there any permanent fix for this?

Dashtilut asked 22/4, 2020 at 19:17. Comments (3):
Hi, it seems like you need to reinitialize the pod network every time the node starts up. Maybe the --pod-network-cidr value is incorrect. Are these worker nodes the same version as the master nodes? Can you share the k8s versions of the nodes and the flannel CNI version? - Outoftheway
Yup, they are the same version. - Dashtilut
I am using K8s version 1.15.5, and I see that cni0 and flannel.1 have IPs that are not from the same subnet. My pod network CIDR is 10.200.0.0/16. On one of the worker nodes, the ifconfig output is: cni0 inet addr:10.200.28.1 Bcast:10.200.28.255 Mask:255.255.255.0; flannel.1 inet addr:10.200.42.0 Bcast:0.0.0.0 Mask:255.255.255.255. subnet.env has: FLANNEL_NETWORK=10.200.0.0/16 FLANNEL_SUBNET=10.200.42.1/24 FLANNEL_MTU=8951 FLANNEL_IPMASQ=true - Dashtilut

TL;DR - recreate network

$ ip link set cni0 down
$ brctl delbr cni0  

Or, as @ws_ suggested in the comments, remove the interfaces and restart the k8s services:

ip link set cni0 down && ip link set flannel.1 down 
ip link delete cni0 && ip link delete flannel.1
systemctl restart containerd && systemctl restart kubelet

Community solutions

It is a known issue, and there are a few community solutions to fix it.

Solution by filipenv is:

On the master and the slave nodes:

$ kubeadm reset
$ systemctl stop kubelet
$ systemctl stop docker
$ rm -rf /var/lib/cni/
$ rm -rf /var/lib/kubelet/*
$ rm -rf /etc/cni/
$ ifconfig cni0 down
$ ifconfig flannel.1 down
$ ifconfig docker0 down

(You may need to manually unmount filesystems from /var/lib/kubelet before calling rm on that dir.) After doing that, I started docker and kubelet back up again and restarted the kubeadm process.
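A minimal sketch of that unmount step (the grep pattern is an assumption; check what mount actually reports on your node):

# List mounts still active under /var/lib/kubelet (secrets, projected volumes, etc.)
mount | grep '/var/lib/kubelet' | awk '{print $3}'
# Unmount each of them before removing the directory
mount | grep '/var/lib/kubelet' | awk '{print $3}' | xargs -r umount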

aysark, and the kubernetes-handbook in its recipe for "Pod stuck in Waiting or ContainerCreating", both recommend:

$ ip link set cni0 down
$ brctl delbr cni0  

Some workarounds from Flannel's KB article

There is also an article in Flannel's KB: "PKS Flannel network gets out of sync with docker bridge network (cni0)".

Workaround 1:

WA1 is just like yours:

    bosh ssh -d <deployment_name> worker -c "sudo /var/vcap/bosh/bin/monit stop flanneld"
    bosh ssh -d <deployment_name> worker -c "sudo rm /var/vcap/store/docker/docker/network/files/local-kv.db"
    bosh ssh -d <deployment_name> worker -c "sudo /var/vcap/bosh/bin/monit restart all"

Workaround 2:

If WA1 didn't help, the KB recommends:

    1. bosh ssh -d <deployment_name> worker -c "sudo /var/vcap/bosh/bin/monit stop flanneld"
    2. bosh ssh -d <deployment_name> worker -c "ifconfig | grep -A 1 flannel"
    3. On a master node, get access to etcd using the referenced KB.
    4. On a master node, run `etcdctlv2 ls /coreos.com/network/subnets/`
    5. Remove all the worker subnet leases from etcd by running `etcdctlv2 rm /coreos.com/network/subnets/<worker_subnet>` for each of the worker subnets from step 2 above (see the sketch after these steps).
    6. bosh ssh -d <deployment_name> worker -c "sudo /var/vcap/bosh/bin/monit restart flanneld"
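For illustration, a hedged sketch of the etcd clean-up in steps 4-5 (assuming etcdctlv2 is a wrapper around ETCDCTL_API=2 etcdctl, and using the 10.200.28.0/24 subnet from the comments above purely as an example key):

    # List all subnet leases flannel has recorded for the cluster
    etcdctlv2 ls /coreos.com/network/subnets/
    # Remove the stale lease for each worker subnet found above, e.g.:
    etcdctlv2 rm /coreos.com/network/subnets/10.200.28.0-24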
Marine answered 1/5, 2020 at 14:14. Comments (5):
This works for me (k8s 1.21 with containerd): 1. ip link set cni0 down && ip link set flannel.1 down 2. ip link delete cni0 && ip link delete flannel.1 3. systemctl restart containerd && systemctl restart kubelet - Malina
Your comment worked for me as well. Thanks @Malina - Humane
You saved me, dude. This is working perfectly in 1.23.x - Undone
It also worked for me, but I had to remove the flannel pod running on this node. - Shapeless
I used this in 1.29 and removing those interfaces worked for me. Didn't need to restart crio or kubelet. - Gosnell

I am running Docker with Kubernetes. I did the following on all my master and slave nodes and got my cluster working:

sudo su
ip link set cni0 down && ip link set flannel.1 down 
ip link delete cni0 && ip link delete flannel.1
systemctl restart docker && systemctl restart kubelet
Prolocutor answered 23/4, 2022 at 15:24.
