How to detect GKE autoupgrading a node in Stackdriver logs
G

4

9

We have a GKE cluster with auto-upgrading nodes. We recently noticed a node become unschedulable and eventually get deleted, which we suspect was the result of an automatic upgrade. Is there a way to confirm (or rule out) in Stackdriver that this was indeed the cause?

Garry answered 24/6, 2019 at 15:34 Comment(3)
Not sure, but it should be doing a cordon and drain, in which case the kubelet would produce the line below, if Stackdriver is scraping that: kubelet[1319]: I0624 18:41:04.771532 1319 kubelet_node_status.go:447] Recording NodeNotSchedulable event message for node gke-squareroute-default-pool-9f095a99-s6z9 - Cotenant
@Cotenant Thanks -- yes, we do get logs with NodeNotSchedulable popping up at that time. I guess that's not entirely sufficient to know that it was caused by the automatic node upgrade (e.g. someone could be doing it manually), but it gets us a good way towards it. - Garry
Normally you should see the upgrade as a node pool operation, but there is currently an issue where the logs are not being created during this operation. They still appear if you manually upgrade the node pool. - Mesomorph
K
16

You can use the following advanced logs queries with Cloud Logging (previously Stackdriver) to detect upgrades to node pools:

protoPayload.methodName="google.container.internal.ClusterManagerInternal.UpdateClusterInternal"
resource.type="gke_nodepool"

and to the master:

protoPayload.methodName="google.container.internal.ClusterManagerInternal.UpdateClusterInternal"
resource.type="gke_cluster"
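The same filters can also be run from a terminal with `gcloud logging read`. The following is a minimal sketch, not a definitive recipe: the helper name `upgrade_filter` is hypothetical, and the `--freshness` and `--limit` values are arbitrary assumptions to adjust for your own retention and volume. The explicit `AND` is equivalent to putting the two conditions on separate lines in the query editor.

```shell
# Hypothetical helper: build the Cloud Logging filter from the queries above
# for a given resource type (gke_nodepool for node pools, gke_cluster for the master).
upgrade_filter() {
  printf 'protoPayload.methodName="google.container.internal.ClusterManagerInternal.UpdateClusterInternal" AND resource.type="%s"' "$1"
}

# Example invocations (require an authenticated gcloud; shown as comments, not run here):
# gcloud logging read "$(upgrade_filter gke_nodepool)" --freshness=30d --limit=20 --format=json
# gcloud logging read "$(upgrade_filter gke_cluster)" --freshness=30d --limit=20 --format=json
```

Keeping the filter in one function means the node pool and master queries differ only in the resource type passed in.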

Additionally, you can control when the updates are applied with Maintenance Windows (as the user aurelius mentioned).

Kermitkermy answered 30/4, 2021 at 20:42 Comment(1)
Thank you. Could you query for termination of a particular node being due to this upgrade event?Photometer
W
1

I know it's not Cloud Logging, but another method to list the auto-upgrade operations is with gcloud. In Cloud Logging I could only find the completion of the upgrade, not the start.

gcloud container operations list
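To narrow that list to upgrade operations, filtering on the operation type may help. This is a sketch under assumptions: the helper name `ops_filter` is hypothetical, the `operationType` key and the `UPGRADE_NODES` / `UPGRADE_MASTER` values are taken from the GKE API's Operation resource, and whether auto-upgrades are reported under those types in your cluster is something to verify against your own output. `OPERATION_ID` and `ZONE` are placeholders.

```shell
# Hypothetical helper: build a gcloud --filter expression for one operation type.
# Assumes the operation resource exposes an "operationType" field.
ops_filter() {
  printf 'operationType=%s' "$1"
}

# Example invocations (require an authenticated gcloud; shown as comments, not run here):
# gcloud container operations list --filter="$(ops_filter UPGRADE_NODES)"
# gcloud container operations list --filter="$(ops_filter UPGRADE_MASTER)"
# gcloud container operations describe OPERATION_ID --zone=ZONE
```

The list output includes START_TIME and END_TIME columns, which can be compared with the time the node became unschedulable.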
Woodbine answered 25/7, 2023 at 16:37 Comment(0)
A
0

Just to add to the post: you can find various GKE-related sample log queries here

Anaya answered 19/3 at 9:15 Comment(0)
E
-1

I think your question has already been answered in the comments. Just as an addition: automatic upgrades occur at regular intervals at the discretion of the GKE team. To get more control, you can create a Maintenance Window as explained here. This is basically a time frame you choose in which automatic upgrades should occur.

Essentiality answered 26/6, 2019 at 14:7 Comment(0)
