Azure AKS Public IP in Non-standard Resource Group

I've been trying to manage an Azure Kubernetes Service (AKS) instance via Terraform. When I create the AKS instance via the Azure CLI per this MS tutorial, then install an ingress controller with a static public IP, per this MS tutorial, everything works fine. This method implicitly creates a service principal (SP).

When I create an otherwise exact duplicate of the AKS cluster via Terraform, I am forced to supply the service principal explicitly. I gave this new SP "Contributor" access to the cluster's entire resource group yet, when I get to the step to create the ingress controller (using the same command that tutorial 2 provided, above: helm install stable/nginx-ingress --set controller.replicaCount=2 --set controller.service.loadBalancerIP="XX.XX.XX.XX"), the ingress service comes up but it never acquires its public IP. The IP status remains "<pending>" indefinitely, and I can find nothing in any log about why. Are there logs that should tell me why my IP is still pending?

Again, I am fairly certain that, other than the SP, the Terraform AKS cluster is an exact duplicate of the one created based on the MS tutorial. Running terraform plan finds no differences between the two. Does anyone have any idea what permission my AKS SP might need or what else I might be missing here? Strangely, I can't find ANY permissions assigned to the implicitly created principal via the Azure portal, but I can't think of anything else that might be causing this behavior.

Not sure if it's a red herring or not, but other users have complained about a similar problem in the context of issues opened against the second tutorial. Their fix always appears to be "tear down your cluster and retry", but that isn't an acceptable solution in this context. I need a reproducible working cluster and azurerm_kubernetes_cluster doesn't currently allow for building an AKS instance with an implicitly created SP.

Loxodrome answered 13/5, 2019 at 13:14 Comment(1)
You can mark your answer to help other members. - Unsphere
15

I'm going to answer my own question, for posterity. It turns out the problem was the resource group where I created the static public IP. AKS clusters use two resource groups: the group that you explicitly created the cluster in, and a second group that the cluster creates implicitly. By default, that second, implicit resource group gets a name starting with "MC_" (the rest of the name is derived from the explicit RG, the cluster name, and the region).

Anyhow, the default AKS configuration requires that the public IP be created within that implicit resource group. Assuming that you created the AKS cluster with Terraform, its name will be exported in ${azurerm_kubernetes_cluster.NAME.node_resource_group}.
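
For anyone not using Terraform, here is a rough sketch of the same workaround with the Azure CLI. The resource names passed in (for example myResourceGroup, myAKSCluster, myAKSPublicIP) are placeholders, not values from my setup, and the `--sku Standard` flag assumes the cluster uses a Standard load balancer. The first helper only builds the default MC_ name from its parts; the second asks Azure for the real node resource group and creates the static IP there.

```shell
# Default name of the implicit node resource group: MC_<group>_<cluster>_<region>.
# This is only the default pattern; a cluster created with a custom node
# resource group name will not follow it.
default_node_rg() {
  local rg="$1" cluster="$2" region="$3"
  printf 'MC_%s_%s_%s\n' "$rg" "$cluster" "$region"
}

# Look up the real node resource group, then create a static public IP in it.
# All three arguments are placeholders chosen by the caller.
create_ip_in_node_rg() {
  local rg="$1" cluster="$2" ip_name="$3" node_rg
  node_rg="$(az aks show --resource-group "$rg" --name "$cluster" \
    --query nodeResourceGroup --output tsv)" || return 1
  # Assumption: the cluster uses a Standard load balancer, so the IP SKU matches.
  az network public-ip create \
    --resource-group "$node_rg" \
    --name "$ip_name" \
    --sku Standard \
    --allocation-method static \
    --query publicIp.ipAddress --output tsv
}
```

Calling `create_ip_in_node_rg myResourceGroup myAKSCluster myAKSPublicIP` should print the new address, which can then be passed as controller.service.loadBalancerIP.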

EDIT 2019-05-23

Since writing this, we found a use case for which the MC_* resource group workaround wasn't good enough. I opened a support ticket with MS and they directed me to this solution: add the following annotation to your LoadBalancer service (or ingress controller service), and make sure that the AKS SP has at least Network Contributor rights in the destination resource group (myResourceGroup in the example below):

metadata:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-resource-group: myResourceGroup

This solved it completely for us.
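
The role assignment part can be sketched with the Azure CLI roughly as follows. This is an illustration rather than the exact command I ran; the function name and its three arguments (the SP's application ID, the subscription ID, and the resource group holding the static IP) are placeholders to fill in.

```shell
# Grant the cluster's service principal "Network Contributor" on the resource
# group that holds the static IP. All arguments are caller-supplied
# placeholders: SP application (client) ID, subscription ID, resource group.
grant_network_contributor() {
  local sp_app_id="$1" subscription_id="$2" ip_rg="$3"
  az role assignment create \
    --assignee "$sp_app_id" \
    --role "Network Contributor" \
    --scope "/subscriptions/${subscription_id}/resourceGroups/${ip_rg}"
}
```

If you don't have the SP's client ID handy, `az aks show --resource-group <group> --name <cluster> --query servicePrincipalProfile.clientId --output tsv` should return it for clusters that use a service principal.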

Loxodrome answered 13/5, 2019 at 19:20 Comment(5)
That's not exactly true; you can use an IP address from any resource group in the same subscription, given the right setup. - Thornton
Yes, @4c74356b41. I followed up with MS and this requires an annotation on the load balancer or ingress controller. I've edited my original post to include this solution. - Loxodrome
Didn't. Changed my mind about an upvote because you didn't direct me to a solution but only reported that there was one, which wasn't very helpful. I only followed up with a support ticket to MS because we found a use case that wasn't solved by my original workaround. - Loxodrome
Well, I don't know how you configured your stuff, so I wouldn't know how to fix it, but it's quite obvious that everything works fine if configured properly, so you need to fix the SP rights. - Thornton
The SP rights were fine from the start. That was the first thing I fixed. The problem was a missing annotation. I've edited my answer to include the annotation. Thanks for mentioning the SP rights again, though. I will add that to my answer too. (You get an upvote for that. :) - Loxodrome
2

Set Static IP Resource Group when Installing Helm Chart

Here is a minimal helm install command for the ingress-nginx controller that works when the static IP is in a different resource group than the cluster's managed node resource group.

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx \
  --set controller.replicaCount=1 \
  --set controller.service.externalTrafficPolicy=Local \
  --set controller.service.loadBalancerIP="$ingress_controller_ip" \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-resource-group"="$STATIC_IP_RESOURCE_GROUP"

The key is the last override to provide the resource group of the static IP.
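
If the address itself isn't at hand, one way to fill in $ingress_controller_ip is to read it back from Azure. This is a small sketch assuming the Azure CLI is logged in; the function name and the example resource names are placeholders, not part of the setup above.

```shell
# Print the address of an existing static public IP.
# Arguments: the resource group that holds the IP, and the IP resource's name
# (both placeholders supplied by the caller).
lookup_static_ip() {
  local ip_rg="$1" ip_name="$2"
  az network public-ip show \
    --resource-group "$ip_rg" \
    --name "$ip_name" \
    --query ipAddress --output tsv
}
```

For example: `ingress_controller_ip="$(lookup_static_ip myResourceGroup myAKSPublicIP)"`.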

Additional Note: Health Probe Endpoints

Also, note that you may need to customize the load balancer health probe if your root path doesn't return a successful HTTP response. We do this by adding the following override (replace /healthz with your probe endpoint):

--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz

Versions

Kubernetes 1.22.6
ingress-nginx-4.1.0
ingress-nginx/controller:v1.2.0
Shawana answered 29/4, 2022 at 18:1 Comment(0)
1

I can't comment yet, so I'm posting this addition as an answer.

Derek is right: you can use an existing IP from a resource group other than the one where the AKS cluster was provisioned. There is a documentation page for this. Just make sure you've done the two steps below:

  1. Add "Network Contributor" role assignment for your AKS service principal to the resource group where your existing static IP is.

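  2. Add the service.beta.kubernetes.io/azure-load-balancer-resource-group: myResourceGroup annotation to the ingress controller service. For example, with the service in the ingress namespace and the static IP in a resource group named datagate (substitute your own values):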

kubectl annotate service ingress-nginx-controller -n ingress service.beta.kubernetes.io/azure-load-balancer-resource-group=datagate
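
To check whether the change took effect, something like the following can help. It is a sketch using standard kubectl commands; the function names are made up for illustration, and the namespace and service name are whatever your install uses. The events at the bottom of the describe output are where load balancer sync errors from the Azure cloud provider usually show up.

```shell
# Print the external IP assigned to a LoadBalancer service.
# The output is expected to be empty while the IP is still pending.
service_external_ip() {
  local namespace="$1" service="$2"
  kubectl get service "$service" --namespace "$namespace" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Show the service details, including its recent events.
service_events() {
  local namespace="$1" service="$2"
  kubectl describe service "$service" --namespace "$namespace"
}
```

For example: `service_external_ip ingress ingress-nginx-controller`, followed by `service_events ingress ingress-nginx-controller` if the first command prints nothing.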
Union answered 9/3, 2021 at 0:5 Comment(0)
