Not authorized to perform sts:AssumeRoleWithWebIdentity - 403

I have been trying to run an external-dns pod using the guide provided by the kubernetes-sigs group. I have followed every step of the guide, but I am getting the error below.

time="2021-02-27T13:27:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 87a3ca86-ceb0-47be-8f90-25d0c2de9f48"

I created the AWS IAM policy using Terraform, and it was created successfully. Except for the IAM role for the service account, which I created with eksctl, everything else was provisioned via Terraform.

I then came across an article which says that creating the AWS IAM policy with the AWS CLI would eliminate this error, so I deleted the policy created with Terraform and recreated it with the AWS CLI. It still throws the same error.
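For reference, the AWS CLI route is essentially the following (the policy name and document path here are illustrative):

aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://policy.json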

Below is my ExternalDNS YAML manifest.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  # If you're using Amazon EKS with IAM Roles for Service Accounts, specify the following annotation.
  # Otherwise, you may safely omit it.
  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::268xxxxxxx:role/eksctl-ats-Eks1-addon-iamserviceaccoun-Role1-WMLL93xxxx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=xyz.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=Z0471542U7WSPZxxxx
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files

I am scratching my head, as there is no proper solution to this error anywhere on the net. I am hoping to find a solution to this issue in this forum.

The end result should show something like the log line below and populate records in the hosted zone.

time="2020-05-05T02:57:31Z" level=info msg="All records are already up to date"
Fate answered 28/2, 2021 at 4:19 Comment(2)
Which version of Terraform are you running? I am currently experiencing this same issue with v0.12.24. – Inchoate
@RyanWalden I am using Terraform v0.14. – Fate

I also struggled with this error.

The problem was in the definition of the trust relationship.

In some official AWS tutorials (like this one) you can see the following setup:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>"
        }
      }
    }
  ]
}

Option 1 for failure

My problem was that I passed a wrong value for my-service-account at the end of ${OIDC_PROVIDER}:sub in the Condition part.

Option 2 for failure

After the previous fix I still faced the same error. It was solved by following this AWS tutorial, which shows the output of using eksctl with the command below:

eksctl create iamserviceaccount \
                --name my-serviceaccount \
                --namespace <your-ns> \
                --cluster <your-cluster-name> \
                --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
                --approve

When you look at the trust relationship tab in the AWS web console, you can see that an additional condition was added, with the postfix :aud and the value sts.amazonaws.com.


So this needs to be added after the "${OIDC_PROVIDER}:sub" condition.
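In other words, the Condition block should end up with both keys, roughly like this (placeholders as in the policy above):

"Condition": {
  "StringEquals": {
    "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>",
    "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
  }
}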

Unreserve answered 5/5, 2021 at 11:53 Comment(3)
Thank you so much, this was so helpful. The need for the OIDC condition isn't even mentioned in the official documentation, which was quite confusing. Your answer really saved me :) – Goldi
I had exactly the problem described in option 1: I had configured the wrong service account name in the condition of the trust relationship. Editing the trust relationship on my role with the correct name fixed it. – Tva
This was very useful, thank you! In my Condition clause I had "system:serviceaccount:*:*", but apparently wildcards do not work in this context. I had to specify the actual namespace and service account to get rid of AccessDenied. – Benavides

I was able to get help from the Kubernetes Slack (shout out to @Rob Del), and this is what we came up with. There's nothing wrong with the k8s RBAC from the article; the issue is the way the IAM role is written. I am using Terraform v0.12.24, but I believe something similar to the following .tf should work for Terraform v0.14:

data "aws_caller_identity" "current" {}

resource "aws_iam_role" "external_dns_role" {
  name = "external-dns"

  assume_role_policy = jsonencode({
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": format(
            "arn:aws:iam::${data.aws_caller_identity.current.account_id}:%s", 
            replace(
              "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}", 
              "https://", 
              "oidc-provider/"
            )
          )
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            format(
              "%s:sub", 
              trimprefix(
                "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}", 
                "https://"
              )
            ) : "system:serviceaccount:default:external-dns"
          }
        }
      }
    ]
  })
}

The above .tf assumes you created your EKS cluster using Terraform and that you use the RBAC manifest from the external-dns tutorial.
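If it helps, a simple output (the name is arbitrary) makes it easy to copy the role ARN into the eks.amazonaws.com/role-arn annotation on the ServiceAccount:

output "external_dns_role_arn" {
  value = aws_iam_role.external_dns_role.arn
}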

Inchoate answered 12/3, 2021 at 18:26 Comment(0)

Similar to what @Rot-man described, I had the wrong prefix for ${OIDC_PROVIDER}: I had included an ARN prefix that needed to be removed. So, replacing:

"arn:aws:iam::123456789100:oidc-provider/oidc.eks.eu-region-1.amazonaws.com/id/5942D3B4F3E74660A6688F6D05FE40C5:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"

with:

"oidc.eks.eu-region-1.amazonaws.com/id/5942D3B4F3E74660A6688F6D05FE40C5:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"

to end up with:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456789100:oidc-provider/oidc.eks.eu-region-1.amazonaws.com/id/5942D3B4F3E74660A6688F6D05FE40C5"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-region-1.amazonaws.com/id/5942D3B4F3E74660A6688F6D05FE40C5:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa",
                    "oidc.eks.eu-region-1.amazonaws.com/id/5942D3B4F3E74660A6688F6D05FE40C5:aud": "sts.amazonaws.com"   
                }
            }
        }
    ]
}

The account ID, region, and OIDC ID values in the examples above are fake, so replace them with your own.
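To inspect what is actually in a role's trust policy, something like this can help (the role name is a placeholder):

aws iam get-role --role-name <role-name> --query 'Role.AssumeRolePolicyDocument'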

Andrel answered 25/7, 2023 at 13:2 Comment(0)

This can also happen if you have a typo in the role you are attempting to assume with the service account, i.e. the role name in the annotation doesn't match the role name in AWS IAM.

For example, if your service account had the annotation

    eks.amazonaws.com/role-arn: arn:aws:iam::12345678:role/external-dns-service-account-oidc-role

but the actual role in AWS had the ARN

    arn:aws:iam::12345678:role/external-dns-service-account

then you would encounter this issue.
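A quick way to compare the two sides (the names here are illustrative):

kubectl describe serviceaccount external-dns -n default   # shows the eks.amazonaws.com/role-arn annotation
aws iam get-role --role-name external-dns-service-account-oidc-role --query 'Role.Arn'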

Skewbald answered 23/11, 2023 at 16:20 Comment(0)

There are a few possibilities here.

Before anything else, does your cluster have an OIDC provider associated with it? IRSA won't work without it.

You can check that in the AWS console, or via the CLI with:

aws eks describe-cluster --name {name} --query "cluster.identity.oidc.issuer"
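If no provider is associated yet, one way to add it is via eksctl, and you can list what is registered in IAM to confirm:

eksctl utils associate-iam-oidc-provider --cluster {name} --approve
aws iam list-open-id-connect-providers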

First

Delete the iamserviceaccount, recreate it, remove the ServiceAccount definition from your ExternalDNS manifest (the entire first section), and re-apply the manifest.

eksctl delete iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster}
eksctl create iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster} \
  --attach-policy-arn {policy-arn} --approve --override-existing-serviceaccounts
kubectl apply -n {namespace} -f {your-externaldns-manifest.yaml}

It may be that there is a conflict because you have overwritten what you created with eksctl create iamserviceaccount by also specifying a ServiceAccount in your ExternalDNS manifest.

Second

Upgrade your cluster to v1.19 (if it's not there already):

eksctl upgrade cluster --name {name} will show you what will be done;

eksctl upgrade cluster --name {name} --approve will do it

Third

Some documentation suggests that in addition to setting securityContext.fsGroup: 65534, you also need to set securityContext.runAsUser: 0.
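In the Deployment's pod spec that would look roughly like this (a sketch; verify against the ExternalDNS version you run):

      securityContext:
        runAsUser: 0     # suggested by some docs in addition to the group setting
        fsGroup: 65534   # lets ExternalDNS read the Kubernetes and AWS token files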

Faction answered 2/3, 2021 at 20:35 Comment(5)
Good morning! I have a doubt: in the ServiceAccount section of the manifest, am I referring to the service account created with eksctl (via the annotation), or am I overriding eksctl's creation? – Fate
@KrisT you are overwriting what you created with eksctl create iamserviceaccount. – Faction
Also @KrisT, just to confirm, you do have an OIDC provider associated with this cluster, correct? – Faction
I just checked my Terraform EKS module config and found that IRSA is disabled. Based on your comment, it looks like I must enable it; I hope that will be enough. I will also try your suggestion of removing the ServiceAccount section from the ExternalDNS YAML manifest. – Fate
Assuming @KrisT provisioned the cluster with Terraform, the upgrade should be performed by changing the version value in the Terraform module or resource and then applying, rather than upgrading through eksctl, since the latter would likely disrupt the state file. – Inchoate

I've been struggling with a similar issue after following the setup suggested here.

I ended up with the errors below in the deployment logs.

time="2021-05-10T06:40:17Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3fda6c69-2a0a-4bc9-b478-521b5131af9b"
time="2021-05-10T06:41:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 7d3e07a2-c514-44fa-8e79-d49314d9adb6"

In my case, the issue was a wrong service account name mapped to the newly created role.

Here is a step-by-step approach to get this done without many hiccups.

  1. Create the IAM Policy
{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "route53:ChangeResourceRecordSets"
          ],
          "Resource": [
            "arn:aws:route53:::hostedzone/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "route53:ListHostedZones",
            "route53:ListResourceRecordSets"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }
  2. Create the IAM role and the service account for your EKS cluster.
eksctl create iamserviceaccount \
    --name external-dns-sa-eks \
    --namespace default \
    --cluster aecops-grpc-test \
    --attach-policy-arn arn:aws:iam::xxxxxxxx:policy/external-dns-policy-eks \
    --approve \
    --override-existing-serviceaccounts
  3. Create a new hosted zone.

aws route53 create-hosted-zone --name "hosted.domain.com." --caller-reference "grpc-endpoint-external-dns-test-$(date +%s)"

  4. Deploy ExternalDNS after creating the ClusterRole and ClusterRoleBinding bound to the previously created service account.
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns-sa-eks
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
      # If you're using kiam or kube2iam, specify the following annotation.
      # Otherwise, you may safely omit it.
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::***********:role/eksctl-eks-cluster-name-addon-iamserviceacco-Role1-156KP94SN7D7
    spec:
      serviceAccountName: external-dns-sa-eks
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=hosted.domain.com. # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=my-hostedzone-identifier
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
  5. Update the Ingress resource with the domain name and reapply the manifest (a minimal Ingress sketch is shown after this list).

For ingress objects, ExternalDNS will create a DNS record based on the host specified for the ingress object.

- host: myapp.hosted.domain.com

  6. Validate the newly created records.

aws route53 list-resource-record-sets --output json \
    --hosted-zone-id "/hostedzone/Z065*********" \
    --query "ResourceRecordSets[?Name == 'myapp.hosted.domain.com.'] | [?Type == 'A']"

[
    {
        "Name": "myapp.hosted.domain.com.",
        "Type": "A",
        "AliasTarget": {
            "HostedZoneId": "ZCT6F*******",
            "DNSName": "****************.elb.ap-southeast-2.amazonaws.com.",
            "EvaluateTargetHealth": true
        }
    }
]
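For step 5, a minimal Ingress sketch could look like the following (the Service name and port are hypothetical; adjust the apiVersion to what your cluster supports):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
spec:
  rules:
  - host: myapp.hosted.domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp        # hypothetical Service fronting the workload
            port:
              number: 80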
Bunko answered 10/5, 2021 at 15:21 Comment(0)

In our case this issue occurred when using the Terraform module to create the EKS cluster and eksctl to create the iamserviceaccount for the aws-load-balancer-controller. It all works fine the first time around, but if you do a terraform destroy, you need to do some cleanup, such as deleting the CloudFormation stack created by eksctl. Somehow things got crossed, and the service account ended up annotated with a role that was no longer valid. So check the annotation on the service account to ensure it is valid, and update it if necessary. In my case I then deleted and redeployed the aws-load-balancer-controller.

%> kubectl describe serviceaccount aws-load-balancer-controller -n kube-system        
Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/managed-by=eksctl
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-JQL4R3JM7I1A
Image pull secrets:  <none>
Mountable secrets:   aws-load-balancer-controller-token-b8hw7
Tokens:              aws-load-balancer-controller-token-b8hw7
Events:              <none>
%>

%> kubectl annotate --overwrite serviceaccount aws-load-balancer-controller eks.amazonaws.com/role-arn='arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-17A92GGXZRY6O' -n kube-system
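After fixing the annotation, the controller pods have to be recreated so they mount a token for the updated role; one way (assuming the usual Deployment name) is:

%> kubectl rollout restart deployment aws-load-balancer-controller -n kube-system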
Rafaellle answered 24/8, 2021 at 4:40 Comment(0)

In my case, attaching a Route 53 permissions policy to the OIDC role resolved the error.

https://medium.com/swlh/amazon-eks-setup-external-dns-with-oidc-provider-and-kube2iam-f2487c77b2a1

Then I used that role with the external-dns service account instead of the cluster role:

  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::<account>:role/external-dns-service-account-oidc-role
Kwiatkowski answered 4/4, 2022 at 18:8 Comment(0)

For me the issue was that the trust relationship was (correctly) set up using one partition, whereas the ServiceAccount was annotated with a different partition, like so:

...
"Principal": {
    "Federated": "arn:aws-us-gov:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
},
...
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::{{ .Values.aws.account }}:role/{{ .Values.aws.roleName }}

Notice arn:aws:iam vs arn:aws-us-gov:iam.
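The fix is to use the same partition in both places; in this case the annotation presumably needed the aws-us-gov partition to match the trust relationship:

    eks.amazonaws.com/role-arn: arn:aws-us-gov:iam::{{ .Values.aws.account }}:role/{{ .Values.aws.roleName }}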

Kra answered 2/12, 2022 at 20:4 Comment(0)

After spending a day on this... it was the StringEquals in Terraform:

StringEquals = {
        "${trimprefix(module.eks.cluster_oidc_issuer_url, "https://")}:sub" : "system:serviceaccount:kube-system:aws-load-balancer-controller",
        "${trimprefix(module.eks.cluster_oidc_issuer_url, "https://")}:aud" : "sts.amazonaws.com"
      }

Notice the use of trimprefix. If you simply use module.eks.cluster_oidc_issuer_url, the value will still include the https:// prefix, which causes the condition to fail.
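A quick sanity check in terraform console (the issuer URL is illustrative):

> trimprefix("https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE1234567890", "https://")
"oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE1234567890"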

Joinder answered 23/4, 2024 at 13:40 Comment(1)
Also, I found this by editing the policy in the console, which flagged errors on the lines where https:// appeared. – Joinder
