`Error 403: Insufficient regional quota to satisfy request: resource "SSD_TOTAL_GB"` when creating a Kubernetes cluster with Terraform

Hi, I am playing around with Kubernetes and Terraform in a Google Cloud free-tier account (trying to use the free $300 credit). Here is my Terraform resource declaration; it is something very standard I copied from the Terraform resource page. Nothing particularly strange here.

resource "google_container_cluster" "cluster" {
  name = "${var.cluster-name}-${terraform.workspace}"
  location = var.region
  initial_node_count = 1
  project = var.project-id
  remove_default_node_pool = true
}

resource "google_container_node_pool" "cluster_node_pool" {
  name       = "${var.cluster-name}-${terraform.workspace}-node-pool"
  location   = var.region
  cluster    = google_container_cluster.cluster.name
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-medium"
    service_account = google_service_account.default.email
    oauth_scopes    = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

This Terraform snippet used to work fine. In order not to burn through the $300 too quickly, I would destroy the cluster with terraform destroy at the end of every day. However, one day the Kubernetes cluster creation just stopped working. Here is the error:

Error: googleapi: Error 403: Insufficient regional quota to satisfy request: resource "SSD_TOTAL_GB": request requires '300.0' and is short '50.0'. project has a quota of '250.0' with '250.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=xxxxxx., forbidden

It looks like something didn't get cleaned up by all those terraform destroy runs, some quota usage eventually built up, and now I am not able to create a cluster anymore. I am still able to create a cluster through the Google Cloud web interface (I tried only with Autopilot, in the same location). I am a bit puzzled why this is happening. Is my assumption correct? Do I need to delete something that doesn't get deleted automatically by Terraform? If so, why? Is there a way to fix this and be able to create the cluster with Terraform again?

Mayotte answered 17/12, 2022 at 17:48 Comment(0)

I ran into the same issue and I think I figured out what's going on. The crucial thing here is to understand the difference between zonal and regional clusters.

tl;dr: a zonal cluster operates in only one zone, whereas a regional cluster may be replicated across multiple zones.

From the docs:

By default, GKE replicates each node pool across three zones of the control plane's region

I think this is why we're seeing the requirement go up to 300 GB (3 × 100 GB), since --disk-size defaults to 100 GB.

The solution is to set the location to a zone rather than a region. Of course, here I'm assuming a zonal cluster would satisfy your requirements. E.g.

resource "google_container_cluster" "cluster" {
  name = "${var.cluster-name}-${terraform.workspace}"
  location = "us-central1-f"
  initial_node_count = 1
  project = var.project-id
  remove_default_node_pool = true
}

resource "google_container_node_pool" "cluster_node_pool" {
  name       = "${var.cluster-name}-${terraform.workspace}-node-pool"
  location   = "us-central1-f"
  cluster    = google_container_cluster.cluster.name
  node_count = 1

  node_config {
    ...
  }
}
Gravely answered 7/2, 2023 at 20:41 Comment(4)
Amazing, that was indeed the issue! I am impressed that you could figure it out without actually seeing the content of my var.region variable. Well done, and thank you! – Mayotte
Haha, the variable name gave it away a bit, plus you mentioned you used the Terraform example (which I also did). I just realized that you said the script used to work and one day suddenly stopped working. Interesting. Was it because you changed from a zone to a region at some point, or did something change in GCP? – Gravely
I think I may have inadvertently started using the variable, or changed it; I don't remember exactly. Thank you! – Mayotte
Please note that this config is zonal (location = "us-central1-f"), whereas I think a region is the better option for production. location = "us-central1" still works; you should set disk_size_gb in node_config (in google_container_node_pool), for example to 10 GB, to fix this. – Shipping

I had a similar issue that was resolved by moving my clusters to a different region. You can view the quotas for a region by replacing $PROJECT-ID$ with your project ID in https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=$PROJECT-ID$ and heading to that link.

If you filter the list to Persistent Disk SSD (GB), you should see a list of all the available regions along with their quotas.

Hope that helps.

(Screenshot: the quota list filtered to Persistent Disk SSD (GB), showing each region's limit.)

Otherwise, your best bet is to request a quota increase for the region you desire.

Flora answered 25/12, 2022 at 22:26 Comment(5)
Usage shows zero everywhere for me. Also, won't switching zones eventually run out of zones? – Mayotte
@Mayotte not sure I completely understand your comment. The error you're receiving indicates that the zone/region you're in has a limit of 250 GB and you need at least 300 GB to start up a cluster. In my screenshots, you can see that the zone asia-east2-a has an unlimited cap, so I would probably be able to spin up my cluster there. – Flora
Ah, you are right. Maybe I could change some config like the machine type? I wonder why it was working before; I am quite sure I was using the same configuration. – Mayotte
@Mayotte did this solve your issue? – Flora
No, unfortunately it didn't. I tried specifying both machine_type = "e2-micro" and machine_type = "n1-standard-1" and got the same error message. The only way I could get it to work so far is by using the Autopilot mode. – Mayotte

Changing the disk type from SSD (the default) to a standard persistent disk makes it work:

--disk-type pd-standard
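
For Terraform users, a rough equivalent (a sketch based on the node pool from the question, not a tested configuration) is the disk_type argument in node_config:

resource "google_container_node_pool" "cluster_node_pool" {
  name       = "${var.cluster-name}-${terraform.workspace}-node-pool"
  location   = var.region
  cluster    = google_container_cluster.cluster.name
  node_count = 1

  node_config {
    preemptible     = true
    machine_type    = "e2-medium"
    disk_type       = "pd-standard" # standard persistent disks do not count against the SSD_TOTAL_GB quota
    service_account = google_service_account.default.email
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]
  }
}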
Logroll answered 10/7, 2023 at 20:47 Comment(0)

Go to Compute Engine > Disks and check if there are any disks in the specified region that are consuming the quota.

This error says that the request requires 300 GB of SSD and you have a quota of only 250 GB in that region. This error generally occurs when the quota is exhausted. You can read more about the types of disk quotas here. You can also request a quota increase if you want.

I am not able to understand why this request needs 300 GB of SSD, as I am not very familiar with Terraform. From the code, it seems that you are creating only one node. As per the Terraform docs, disk_size_gb defaults to 100 GB, so it should take only 100 GB. Try setting a smaller value for disk_size_gb in node_config and check if it helps.

Contemptible answered 18/12, 2022 at 6:17 Comment(1)
Thank you for your answer. (Un)fortunately I don't have any disks created in any region, so I am afraid it is not that. I will try to create the cluster in a different way. – Mayotte

I was able to fix this by creating the cluster in Autopilot mode using enable_autopilot = true, rather than manually creating the node pool (through Terraform), and letting Google take care of that. However, I am afraid that I may have just swept the problem under the carpet, as the cluster may initially be created with a small disk and then get scaled up as needed.
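
For reference, a minimal sketch of that configuration (it reuses the variables from the question; the node pool resource is dropped because Autopilot manages the nodes itself):

resource "google_container_cluster" "cluster" {
  name             = "${var.cluster-name}-${terraform.workspace}"
  location         = var.region   # Autopilot clusters are regional
  project          = var.project-id
  enable_autopilot = true         # Google provisions and scales the nodes and their disks
}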

Mayotte answered 20/12, 2022 at 19:6 Comment(0)

I faced the same issue today while trying to create a Standard cluster in the GCP Console itself, and noted the error message below:

Create Kubernetes Engine cluster cluster-1-standard
5 hours ago
My First Project
Insufficient regional quota to satisfy request: resource "SSD_TOTAL_GB": request requires '300.0' and is short '50.0'. project has a quota of '250.0' with '250.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=ecstatic-splice-400311.

However, after I read through this post I realized a way to cut the total size being requested (i.e. below 300 GB) as a workaround. To do that:

  1. Navigate to Node Pools, below Cluster basics, in the left-hand menu.
  2. Expand default-pool and select Nodes.
  3. Under "Configure node settings", reduce the boot disk size from the 100 GB default to something like 75 GB, so that the total (75 × 3 = 225 GB) comes in under the 250 GB quota.
  4. Click Create cluster; the cluster was created with no other issues.
Makhachkala answered 17/10, 2023 at 8:57 Comment(0)

Keeping the cluster regional is the better option (compared with zonal). To scale down disk usage (the default is 100 GB per instance), we can set it in the node pool config:

resource "google_container_node_pool" "general" {
  # Other config

  node_config {
    machine_type = var.machine_type
    disk_size_gb = 10 # 10 GB boot disk instead of the default 100 GB
  }
  
  # Other config
}
Shipping answered 13/12, 2023 at 16:41 Comment(0)

resource "google_container_cluster" "cluster" {
  name = "${var.cluster-name}-${terraform.workspace}"
  location = "us-central1-f"
  initial_node_count = 1
  project = var.project-id
  remove_default_node_pool = true
  node_config {
    disk_size_gb = "20"
  }
}

Adding a node_config {} with a disk_size_gb that does not exceed the quota when multiplied by 3 will solve the problem if you use a regional cluster. You really don't need to stick with a zonal cluster for this.

Dogcart answered 8/3, 2024 at 14:45 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – Celebrity

If you want to create a Kubernetes cluster using the Standard configuration, make sure you have selected a maximum of 2 nodes or fewer. You need to check "Node pool details". To change or check that, follow the steps below (a rough Terraform equivalent is sketched after the list):

  1. Create Cluster.
  2. Select Standard and click Configure.
  3. Provide the cluster name, a location of your choice, and so on.
  4. On the left panel, locate "NODE POOLS" and click on "default-pool".
  5. Locate (or search the browser page for) "Number of nodes"; you will find a text box to set the number of nodes. In your case it must be set to more than 2 nodes, which is why you are getting the error. The other option is to increase the limit for your desired region, which may require approval from Google under the new policy.
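
For those using Terraform rather than the Console, a rough equivalent of capping the node count might look like the sketch below (it reuses the names from the question and assumes the 100 GB default boot disk mentioned elsewhere in this thread; adjust to taste):

resource "google_container_node_pool" "cluster_node_pool" {
  name       = "${var.cluster-name}-${terraform.workspace}-node-pool"
  location   = "us-central1-f" # zonal, so node_count is the total number of nodes
  cluster    = google_container_cluster.cluster.name
  node_count = 2               # 2 nodes x 100 GB default boot disk = 200 GB, under the 250 GB quota

  node_config {
    machine_type = "e2-medium"
  }
}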
Chamfer answered 4/1, 2023 at 16:42 Comment(1)
Since I am trying to create the cluster through Terraform, what Terraform configuration should I use to achieve what you are suggesting? – Mayotte

I faced a similar issue (note that this was not for production; it was just for my study/hands-on practice). Rather than bothering about the quota, I just reduced the disk size (--disk-size "20"), which I was okay with since I was only working for study purposes. The command line below worked for me.

gcloud beta container --project "ace-exam-kubernetes-engine" clusters create "my-kubernetes-cluster-1" --no-enable-basic-auth --cluster-version "1.27.3-gke.100" --release-channel "regular" --machine-type "e2-medium" --image-type "COS_CONTAINERD" --disk-type "pd-balanced" --disk-size "20" --metadata disable-legacy-endpoints=true --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" --num-nodes "3" --logging=SYSTEM,WORKLOAD --monitoring=SYSTEM --enable-ip-alias --network "projects/ace-exam-kubernetes-engine/global/networks/default" --subnetwork "projects/ace-exam-kubernetes-engine/regions/us-central1/subnetworks/default" --no-enable-intra-node-visibility --default-max-pods-per-node "110" --security-posture=standard --workload-vulnerability-scanning=disabled --no-enable-master-authorized-networks --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver --enable-autoupgrade --enable-autorepair --max-surge-upgrade 1 --max-unavailable-upgrade 0 --binauthz-evaluation-mode=DISABLED --enable-managed-prometheus --enable-shielded-nodes --node-locations "us-central1-c" --zone "us-central1-c"

Note: This was a "Standard" cluster and not an "Autopilot" cluster.

Hower answered 21/11, 2023 at 6:22 Comment(0)
