I have been trying to debug a very odd delay in my K8S deployments and have tracked it down to the simple reproduction below. It appears that if I set initialDelaySeconds on a startup probe, or leave it at 0 and the probe fails even once, the probe doesn't get run again for a while, and the pod takes at least 1-1.5 minutes to reach the Ready:true state.
I am running locally on Ubuntu 18.04 with microk8s v1.19.3 and the following versions:
- kubelet: v1.19.3-34+a56971609ff35a
- kube-proxy: v1.19.3-34+a56971609ff35a
- containerd://1.3.7
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: microbot
  name: microbot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: microbot
  strategy: {}
  template:
    metadata:
      labels:
        app: microbot
    spec:
      containers:
      - image: cdkbot/microbot-amd64
        name: microbot
        command: ["/bin/sh"]
        args: ["-c", "sleep 3; /start_nginx.sh"]
        #args: ["-c", "/start_nginx.sh"]
        ports:
        - containerPort: 80
        startupProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 0 # 5 also has same issue
          periodSeconds: 1
          failureThreshold: 10
          successThreshold: 1
        ## httpGet:
        ##   path: /
        ##   port: 80
        ## initialDelaySeconds: 0
        ## periodSeconds: 10
        ## failureThreshold: 1
        resources: {}
      restartPolicy: Always
      serviceAccountName: ""
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: microbot
  labels:
    app: microbot
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: microbot
The issue is that if I have any delay in the startupProbe, or if there is an initial failure, the pod gets into the Initialized:true state but has Ready:False and ContainersReady:False. It will not change from this state for 1-1.5 minutes. I haven't found a pattern to the settings.
I left in the commented-out settings as well so you can see what I am trying to get to here. What I have is a container whose service takes a few seconds to get started. I want to tell the startupProbe to wait a little bit and then check every second to see if we are ready to go. The configuration seems to work, but there is a baked-in delay that I can't track down. Even after the startup probe is passing, the pod does not transition to Ready for more than a minute.
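To put a number on the delay described above, one option is to compare the lastTransitionTime of the Initialized and Ready conditions in the pod status. The following is a minimal sketch, assuming the conditions list has the shape found under .status.conditions in `kubectl get pod <pod-name> -o json`; the helper names and the sample timestamps are made up for illustration and are not taken from my cluster.

```python
from datetime import datetime, timezone


def parse_k8s_time(ts):
    # Pod condition timestamps are RFC 3339 in UTC, e.g. "2020-11-05T10:00:00Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def ready_delay_seconds(conditions):
    """Seconds between Initialized=True and Ready=True, or None if not there yet."""
    by_type = {c["type"]: c for c in conditions}
    initialized = by_type.get("Initialized")
    ready = by_type.get("Ready")
    if initialized is None or ready is None:
        return None
    if initialized["status"] != "True" or ready["status"] != "True":
        return None
    start = parse_k8s_time(initialized["lastTransitionTime"])
    end = parse_k8s_time(ready["lastTransitionTime"])
    return (end - start).total_seconds()


# Hypothetical sample data, shaped like .status.conditions of a pod.
sample_conditions = [
    {"type": "Initialized", "status": "True", "lastTransitionTime": "2020-11-05T10:00:00Z"},
    {"type": "Ready", "status": "True", "lastTransitionTime": "2020-11-05T10:01:15Z"},
    {"type": "ContainersReady", "status": "True", "lastTransitionTime": "2020-11-05T10:01:15Z"},
    {"type": "PodScheduled", "status": "True", "lastTransitionTime": "2020-11-05T10:00:00Z"},
]

print(ready_delay_seconds(sample_conditions))
```

Running this against the real conditions for each probe setting would show whether the gap between Initialized and Ready actually changes with the configuration.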
Is there some setting elsewhere in k8s that is delaying the amount of time before a Pod can move into Ready if it isn't Ready initially?
Any ideas are greatly appreciated.
, so I would suggest deleting it, then configuring failureThreshold with higher values. The startup probe uses failureThreshold * periodSeconds, so with your configuration that's 10s, which might not be enough for your application. Could you try increasing it, for example to failureThreshold: 30 and periodSeconds: 10, and check again? – Photomultiplier
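The arithmetic in this comment can be written out as a small sketch. It assumes the startup window is roughly initialDelaySeconds + failureThreshold * periodSeconds and ignores timeoutSeconds and any scheduling jitter in the kubelet; the function name is made up for illustration.

```python
def startup_budget_seconds(initial_delay_seconds, period_seconds, failure_threshold):
    # Approximate time the container has to pass its startup probe
    # before the kubelet gives up and restarts it.
    return initial_delay_seconds + failure_threshold * period_seconds


# Values from the question: periodSeconds: 1, failureThreshold: 10.
question_budget = startup_budget_seconds(0, 1, 10)

# Values suggested in the comment: periodSeconds: 10, failureThreshold: 30.
suggested_budget = startup_budget_seconds(0, 10, 30)

print(question_budget, suggested_budget)
```

With the question's settings the container has about 10 seconds to start, while the suggested settings allow about 300 seconds.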