kubectl drain and rolling update, downtime

Does kubectl drain first make sure that pods with replicas=1 are healthy on some other node?
Assuming the pod is controlled by a Deployment, and the pods can indeed be moved to other nodes. As far as I can see, drain currently only evicts (deletes) the pods from the node, without scheduling replacements first.

Bubonocele asked 15/12, 2019 at 11:06

In addition to Suresh Vishnoi's answer:

If a PodDisruptionBudget is not specified and you have a Deployment with one replica, the pod will be terminated and then a new pod will be scheduled on another node.

To make sure your application stays available during the node draining process you have to specify a PodDisruptionBudget and run more replicas. If you have 1 pod with minAvailable: 30%, drain will refuse to evict it with the following error:

error when evicting pod "pod01" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
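
For reference, here is a minimal sketch of such a PodDisruptionBudget. The my-app-pdb name and app: my-app label are hypothetical, and policy/v1 assumes Kubernetes 1.21+ (older clusters use policy/v1beta1):

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: my-app-pdb
    spec:
      minAvailable: "30%"
      selector:
        matchLabels:
          app: my-app              # must match the labels on the Deployment's pods

With a single replica, 30% of 1 rounds up to 1 pod that must stay available, so the eviction is blocked and kubectl drain keeps retrying, as in the error above.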

Briefly, this is how the draining process works:

As explained in the documentation, the kubectl drain command "safely evicts all of your pods from a node before you perform maintenance on the node and allows the pod's containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified".
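
As a rough illustration (the node name is a placeholder), a typical invocation looks like this; --ignore-daemonsets is usually needed because DaemonSet pods are not evicted to other nodes:

    kubectl drain <node-name> --ignore-daemonsets
    # when maintenance is finished, allow scheduling on the node again:
    kubectl uncordon <node-name>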

Drain does two things:

  1. Cordons the node: the node is marked as unschedulable, so new pods cannot be scheduled on it. This makes sense: if we know the node will be under maintenance, there is no point in scheduling a pod there only to have to reschedule it on another node because of that maintenance. From the Kubernetes perspective, cordoning adds a taint to the node: node.kubernetes.io/unschedulable:NoSchedule

  2. Evicts/deletes the pods: after the node is marked as unschedulable, drain tries to evict the pods that are running on it. It uses the Eviction API, which takes PodDisruptionBudgets into account (on clusters where the Eviction API is not supported it falls back to plainly deleting the pods). An allowed eviction deletes the pod gracefully, honouring its GracePeriodSeconds, so the pod can finish its processes (a short sketch of both steps follows this list).
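
A rough sketch of both steps (pod, namespace and node names are placeholders; the Eviction kind is policy/v1 on current clusters, policy/v1beta1 on older ones):

    # step 1: inspect the taint added by cordoning
    kubectl describe node <node-name> | grep -A1 Taints

    # step 2: drain requests evictions through each pod's eviction subresource;
    # roughly equivalent by hand, via kubectl proxy:
    kubectl proxy --port=8001 &
    curl -X POST -H 'Content-Type: application/json' \
      -d '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"pod01","namespace":"default"}}' \
      http://localhost:8001/api/v1/namespaces/default/pods/pod01/eviction
    # the API server checks PodDisruptionBudgets first; if the eviction would
    # violate the budget it is rejected (HTTP 429) and drain retries later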

Hatcher answered 17/12, 2019 at 11:46
Comments (6):
So as I see it, drain is not always a safe command if the goal is to avoid downtime. Can k8s just do a normal rolling update when a node is drained, as if I had applied a new deployment? Can I specify a default PDB for a namespace? Or, in short, how do I use drain safely if I have some lone pods? – Bubonocele
It is safe if you have more than one replica, so it can terminate the pod on the drained node and schedule it on another to achieve the desired state. If you want to keep only one replica, do not specify a PDB and tolerate occasional downtime (kubernetes.io/docs/tasks/run-application/configure-pdb/…) – Hatcher
And if I have two replicas and they both run on the same node which is asked to be drained? – Bubonocele
Then it will terminate and reschedule them one by one to guarantee availability. – Hatcher
So in other words, if a deployment has more than 1 replica, the drain process will act in the same way as the rollingUpdate process? If the deployment has 1 replica, then downtime might be possible. Is that correct? – Galactopoietic
@Hatcher If I have a deployment with just 1 replica and the replacement pod didn't come up successfully during the drain of the node, will the drain fail? – Gilbertegilbertian

New pods are scheduled when the desired number of pods is not available (desired state != current state), irrespective of whether that is caused by draining or by node failure.

With the PodDisruptionBudget resource you can manage the disruption during the draining of the node.

You can specify only one of maxUnavailable and minAvailable in a single PodDisruptionBudget. maxUnavailable can only be used to control the eviction of pods that have an associated controller managing them. In the examples below, “desired replicas” is the scale of the controller managing the pods being selected by the PodDisruptionBudget. https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget

Example 1: With a minAvailable of 5, evictions are allowed as long as they leave behind 5 or more healthy pods among those selected by the PodDisruptionBudget’s selector.

Example 2: With a minAvailable of 30%, evictions are allowed as long as at least 30% of the number of desired replicas are healthy.

Example 3: With a maxUnavailable of 5, evictions are allowed as long as there are at most 5 unhealthy replicas among the total number of desired replicas.

Example 4: With a maxUnavailable of 30%, evictions are allowed as long as no more than 30% of the desired replicas are unhealthy.
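
Complementing the minAvailable sketch in the answer above, here is a minimal sketch of the maxUnavailable variant (name and selector are hypothetical):

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: my-app-max-unavailable
    spec:
      maxUnavailable: "30%"        # as in Example 4; a fixed number such as 5 gives Example 3
      selector:
        matchLabels:
          app: my-app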

Throughout answered 15/12, 2019 at 11:24
Comments (8):
How does 30% minAvailable work when you have only one pod? – Bubonocele
It will refuse to drain. – Same
Can Kubernetes schedule new pods before it drains the node (like in a normal rolling update)? Currently I have maxUnavailable: 25% and maxSurge: 25%, and I do see downtime when using kubectl drain --ignore-daemonsets – Bubonocele
You would have to run more than one replica. The drain process won't surge like a deployment will. – Same
Usually, the replicas will be running on different nodes, so draining one node will not cause downtime. You can configure scheduler policies to spread replicas across AZs and regions, and in addition you can use anti-affinity among replicas (a sketch follows after these comments). – Throughout
What is the ideal way now if we have the default rolling update of 25% and are running a single replica of many services? Do we have to run multiple replicas? – Palestine
Is there any way, or a workaround? I can't run multiple replicas of the service due to a limitation of the application, and if I set a PDB it will create issues. – Palestine
It's necessary to run multiple replicas in order to use a PDB. You need to run at least minAvailable + 1 pods for the PodDisruptionBudget to work correctly. – Throughout
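
Regarding the anti-affinity suggestion in the comments above, a rough Deployment fragment (labels, names and image are assumptions) that asks the scheduler to spread replicas across nodes could look like this:

    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      app: my-app
                  topologyKey: kubernetes.io/hostname   # spread over nodes; topology.kubernetes.io/zone spreads over AZs
          containers:
          - name: my-app
            image: my-app:latest                        # placeholder image

Because the anti-affinity is preferred rather than required, the replicas can still land on the same node when nothing else fits, so a PodDisruptionBudget remains useful during drains.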
