How to remove a master node from an HA cluster and also from the etcd cluster

I am new to k8s and I have run into a problem that I cannot resolve.

I am building an HA cluster of master nodes. While running some tests (removing one node and adding it back), I noticed that the etcd cluster does not update its member list.

Sample of problem below:

$ kubectl get pods -A
NAMESPACE                NAME                                                 READY   STATUS    RESTARTS   AGE
cri-o-metrics-exporter   cri-o-metrics-exporter-77c9cf9746-qlp4d              0/1     Pending   0          16h
haproxy-controller       haproxy-ingress-769d858699-b8r8q                     0/1     Pending   0          16h
haproxy-controller       ingress-default-backend-5fd4986454-kvbw8             0/1     Pending   0          16h
kube-system              calico-kube-controllers-574d679d8c-tkcjj             1/1     Running   3          16h
kube-system              calico-node-95t6l                                    1/1     Running   2          16h
kube-system              calico-node-m5txs                                    1/1     Running   2          16h
kube-system              coredns-7588b55795-gkfjq                             1/1     Running   2          16h
kube-system              coredns-7588b55795-lxpmj                             1/1     Running   2          16h
kube-system              etcd-masterNode1                                     1/1     Running   2          16h
kube-system              etcd-masterNode2                                     1/1     Running   2          16h
kube-system              kube-apiserver-masterNode1                           1/1     Running   3          16h
kube-system              kube-apiserver-masterNode2                           1/1     Running   3          16h
kube-system              kube-controller-manager-masterNode1                  1/1     Running   4          16h
kube-system              kube-controller-manager-masterNode2                  1/1     Running   4          16h
kube-system              kube-proxy-5q6xs                                     1/1     Running   2          16h
kube-system              kube-proxy-k8p6h                                     1/1     Running   2          16h
kube-system              kube-scheduler-masterNode1                           1/1     Running   3          16h
kube-system              kube-scheduler-masterNode2                           1/1     Running   6          16h
kube-system              metrics-server-575bd7f776-jtfsh                      0/1     Pending   0          16h
kubernetes-dashboard     dashboard-metrics-scraper-6f78bc588b-khjjr           1/1     Running   2          16h
kubernetes-dashboard     kubernetes-dashboard-978555c5b-9jsxb                 1/1     Running   2          16h
$ kubectl exec etcd-masterNode2 -n kube-system -it -- sh
sh-5.0# etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |            NAME            |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
| 4c209e5bc1ca9593 | started |         masterNode1        |     https://IP1:2380      |     https://IP1:2379      |      false |
| 676d4bfab319fa22 | started |         masterNode2        |     https://IP2:2380      |     https://IP2:2379      |      false |
| a9af4b00e33f87d4 | started |         masterNode3        |     https://IP3:2380      |     https://IP3:2379      |      false |
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
sh-5.0# exit
$ kubectl get nodes
NAME                         STATUS   ROLES    AGE   VERSION
masterNode1                  Ready    master   16h   v1.19.0
masterNode2                  Ready    master   16h   v1.19.0

I assume that I am removing the node from the cluster correctly. The procedure I am following (a consolidated sketch follows the list):

  1. kubectl drain <nodeName> --ignore-daemonsets --delete-local-data
  2. kubectl delete node <nodeName>
  3. kubeadm reset
  4. rm -f /etc/cni/net.d/* # Removing CNI configuration
  5. rm -rf /var/lib/kubelet # Removing the /var/lib/kubelet dir
  6. rm -rf /var/lib/etcd # Removing /var/lib/etcd
  7. iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X && iptables -t filter -F && iptables -t filter -X # Flushing iptables
  8. ipvsadm --clear
  9. rm -rf /etc/kubernetes # Removing /etc/kubernetes (in case of certificate change)
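
For reference, here is the same procedure collapsed into a single sketch. The node name masterNode3 is a placeholder for illustration; the first two commands run on a surviving master, the rest on the node being removed:

NODE=masterNode3                        # placeholder: node being removed

# On a surviving master: evict workloads and deregister the node
kubectl drain "$NODE" --ignore-daemonsets --delete-local-data
kubectl delete node "$NODE"

# On the node being removed: undo kubeadm's changes and clean up state
kubeadm reset
rm -f /etc/cni/net.d/*                  # CNI configuration
rm -rf /var/lib/kubelet                 # kubelet state
rm -rf /var/lib/etcd                    # local etcd data
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X && iptables -t filter -F && iptables -t filter -X
ipvsadm --clear
rm -rf /etc/kubernetes                  # kubeconfigs and certificates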

I am running Kubernetes v1.19.0 and etcd 3.4.9-1.

The cluster is running on bare metal nodes.

Is this a bug, or am I not removing the node correctly from the etcd cluster?

Crustaceous asked 7/10, 2020 at 10:16. Comments (2):
Escuage: You can use etcdctl member remove to remove the node from etcd. Is it not working?
Crustaceous: I could try this, but I was hoping that etcd would automatically keep track of nodes joining and leaving the cluster. I want to minimize human interaction as much as possible.

Thanks to Mariusz K. I found the answer to my problem. In case someone else runs into the same issue, here is how I solved it.

First, query the HA cluster for the etcd members (sample below):

$ kubectl exec etcd-<nodeNameMasterNode> -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list
1863b58e85c8a808, started, nodeNameMaster1, https://IP1:2380, https://IP1:2379, false
676d4bfab319fa22, started, nodeNameMaster2, https://IP2:2380, https://IP2:2379, false
b0c50c50d563ed51, started, nodeNameMaster3, https://IP3:2380, https://IP3:2379, false

Once you have the member list, you can remove any member you want. Sample:

kubectl exec etcd-nodeNameMaster1 -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member remove b0c50c50d563ed51
Member b0c50c50d563ed51 removed from cluster d1e1de99e3d19634

I wanted to be able to remove a member from the etcd cluster without first opening a shell inside the pod and running a second command there; running etcdctl through kubectl exec accomplishes that, as the sketch below shows.
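
To avoid copying the member ID by hand, the lookup and the removal can be combined. Below is a minimal sketch assuming the default comma-separated member list output; ETCD_POD and NODE_TO_REMOVE are placeholders you would adjust for your cluster:

ETCD_POD=etcd-masterNode1          # placeholder: any healthy etcd pod
NODE_TO_REMOVE=masterNode3         # placeholder: member to delete
ETCDCTL="etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key"

# 'member list' prints: ID, status, name, peer addrs, client addrs, learner
MEMBER_ID=$(kubectl exec "$ETCD_POD" -n kube-system -- sh -c "$ETCDCTL member list" | awk -F', ' -v n="$NODE_TO_REMOVE" '$3 == n {print $1}')

# Remove the member only if a matching name was actually found
if [ -n "$MEMBER_ID" ]; then
  kubectl exec "$ETCD_POD" -n kube-system -- sh -c "$ETCDCTL member remove $MEMBER_ID"
else
  echo "no etcd member named $NODE_TO_REMOVE" >&2
fi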

Crustaceous answered 12/10, 2020 at 17:10. Comments (2):
Knopp: If you want to remove the member from etcd only, you can use kubeadm reset phase remove-etcd-member.
Plasmasol: @Knopp it appears that this would only work if the node in question is still up and running. If it has failed and is not available, what is said in the answer seems like the way to go.
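
For completeness, a sketch of the kubeadm variant mentioned in the comment above. It runs on the master node being removed, while that node can still reach the cluster; the kubeconfig path shown is kubeadm's default and may differ on your setup:

# Removes this node's member from the etcd cluster, nothing else
kubeadm reset phase remove-etcd-member --kubeconfig /etc/kubernetes/admin.conf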
