I asked and answered this on re:Post, but I will summarize things here to help the next lost soul who is trying to solve this.
First, let's use an example account setup like this:
- artifacts: the AWS account where ECR lives
- production: the AWS account where EKS lives
In the artifacts account, we have a repository that we will call example-repo, and in the production account, we have an IAM role named prod-eks-node-role which is in use by the EKS nodes. To make a problem statement: we need EKS in production to be able to pull images from the ECR example-repo in artifacts.
When EKS in production, via the prod-eks-node-role, attempts to access the example-repo ECR repository in the artifacts account, the auth workflow looks like this:
Step 1: Generate ECR Authorization Token in the Same Account
prod-eks-node-role calls the equivalent of aws ecr get-login-password within its own account in production. To do this, prod-eks-node-role must have a policy attached to it like this:
{
  "Statement": [
    {
      "Action": "ecr:GetAuthorizationToken",
      "Effect": "Allow",
      "Resource": "*",
      "Sid": "AllowGetAuthToken"
    }
  ],
  "Version": "2012-10-17"
}
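One way to attach a policy like the one above is as an inline policy through the AWS CLI. This is a minimal sketch, not the only approach: it assumes the production profile has permission to modify IAM roles, and the inline policy name AllowGetAuthToken is a hypothetical name chosen for illustration.

```shell
# Write the policy document shown above to a temporary file.
POLICY_FILE="$(mktemp)"
cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGetAuthToken",
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    }
  ]
}
EOF

# Attach the document to the node role as an inline policy.
# The policy name is a hypothetical choice; any valid name works.
attach_auth_token_policy() {
  aws --profile production iam put-role-policy \
    --role-name prod-eks-node-role \
    --policy-name AllowGetAuthToken \
    --policy-document "file://${POLICY_FILE}"
}

# Uncomment to apply against the real account:
# attach_auth_token_policy
```

The function is defined but not called, so the snippet can be sourced and inspected before anything is changed in IAM.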
At first glance, this seems odd: why are we generating an authorization token within the production account in order to access the artifacts ECR registry?
The reason is that ECR authorization tokens (Docker login passwords) uniquely identify the given IAM principal regardless of AWS account. When a password is generated, AWS correlates the IAM principal (in our case, prod-eks-node-role in production) with the password. Then, when prod-eks-node-role attempts to perform actions on the ECR repository (e.g. pull/push), the generated password tells ECR who is accessing it.
Now, when EKS in production attempts to pull an image from the ECR example-repo repository in artifacts, ECR knows which principal is asking and can check the repository policy against it. We can think of this first step as authentication, or identifying the user, and this step is effectively global across all AWS accounts.
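To make the authentication step a little more concrete: the token returned by ecr:GetAuthorizationToken is a base64-encoded user:password pair whose user part is always AWS, which is why docker login is called with -u AWS in the example further down. The sketch below decodes the user part. ecr_token_user is a hypothetical helper name, and the offline demonstration uses a made-up token rather than a real one.

```shell
# Print the user part of a base64-encoded "user:password" ECR token.
ecr_token_user() {
  printf '%s' "$1" | base64 -d | cut -d: -f1
}

# Against a real account (assumes the production profile exists):
# TOKEN="$(aws --profile production ecr get-authorization-token \
#   --query 'authorizationData[0].authorizationToken' --output text)"
# ecr_token_user "$TOKEN"

# Offline demonstration with a made-up token in the same format.
FAKE_TOKEN="$(printf 'AWS:not-a-real-password' | base64)"
ecr_token_user "$FAKE_TOKEN"   # prints: AWS
```

The identity of the caller is carried by the password part, which is opaque to the client.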
Step 2: Attempt to Perform Actions Against ECR in the Other Account
Next, IAM in the artifacts account checks the ECR example-repo repository policy for whether the authenticated/identified IAM principal can perform actions. This step is authorization.
In our case, we simply want to allow EKS to pull images from our ECR repository named example-repo. To do this, we need to define an ECR repository policy which allows the IAM principal (prod-eks-node-role) to perform the related pull actions. Define this on the example-repo:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::987654321098:role/prod-eks-node-role"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:DescribeRepositories",
        "ecr:GetDownloadUrlForLayer",
        "ecr:ListImages"
      ]
    }
  ]
}
Replace the ARN above with the ARN of your role.
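To apply a repository policy like this from the command line, something along these lines should work. It is a sketch under assumptions: the artifacts profile must be allowed to call ecr:SetRepositoryPolicy, and render_pull_policy and apply_pull_policy are hypothetical helper names.

```shell
# Emit the pull policy shown above for the role ARN given as $1.
render_pull_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": { "AWS": "$1" },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:DescribeRepositories",
        "ecr:GetDownloadUrlForLayer",
        "ecr:ListImages"
      ]
    }
  ]
}
EOF
}

# Set the rendered policy on example-repo in the artifacts account.
apply_pull_policy() {
  aws --profile artifacts ecr set-repository-policy \
    --repository-name example-repo \
    --policy-text "$(render_pull_policy "$1")"
}

# Uncomment to apply against the real account:
# apply_pull_policy "arn:aws:iam::987654321098:role/prod-eks-node-role"
```

Note that set-repository-policy replaces the whole repository policy, so any existing statements need to be included in the rendered document.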
After defining this, EKS should be able to:
- ✅ authenticate with ECR globally to identify itself
- ✅ authorize with the ECR repository example-repo in the artifacts account
And with that, we've come full circle.
Example
To see how this workflow plays out, we can walk through it with the AWS CLI. We will assume that we have an AWS CLI profile named production for the account where EKS lives, and a profile named artifacts for the account where ECR lives.
Start by discovering what IAM principal you're using:
aws --profile production sts get-caller-identity | jq -rS '.Arn'
The output from this command must be allowed in the ECR repository policy. Do that first, and then continue.
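One caveat when the caller is an assumed role, as EKS nodes typically are: get-caller-identity prints an STS ARN of the form arn:aws:sts::<account>:assumed-role/<role>/<session>, whereas repository policies usually name the IAM role ARN. The helper below is a hypothetical sketch that converts one form to the other. It assumes the role has no IAM path, since the assumed-role ARN does not include the path.

```shell
# Convert an STS assumed-role ARN into the matching IAM role ARN.
# Any other ARN (for example an IAM user ARN) is printed unchanged.
# Assumption: the role has no IAM path, because the STS ARN omits it.
assumed_role_to_role_arn() {
  printf '%s\n' "$1" |
    sed -E 's#^arn:(aws[a-z-]*):sts::([0-9]{12}):assumed-role/([^/]+)/.*$#arn:\1:iam::\2:role/\3#'
}

# Example with the caller identity from the production profile:
# assumed_role_to_role_arn \
#   "$(aws --profile production sts get-caller-identity | jq -rS '.Arn')"
```

If the role does have a path, look up its full ARN in IAM rather than relying on this conversion.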
Next, let's get a password for ECR:
DOCKER_PASSWORD="$(aws --profile production ecr get-login-password)"
Note that the command does not identify a target account; it runs with the credentials in the production profile against ECR globally.
If the above step fails, ensure that your IAM principal (user/role) has the ecr:GetAuthorizationToken permission.
Next, we will use the password to authenticate our Docker daemon with the specific registry in the artifacts account:
ARTIFACTS_ACCOUNT_ID=123456789012
ARTIFACTS_ECR_REGION=us-east-1
docker login -u AWS --password "$DOCKER_PASSWORD" \
"${ARTIFACTS_ACCOUNT_ID}.dkr.ecr.${ARTIFACTS_ECR_REGION}.amazonaws.com"
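If you obtained a valid password, the above step should succeed, because authentication is not tied to a particular AWS account. The token is tied to a region, though, so the region used by the production profile must match ARTIFACTS_ECR_REGION.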
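The password and login steps can also be combined so that the password never appears in the process list or shell history. This is a sketch assuming the same production profile as above. ecr_registry_host and ecr_login are hypothetical helper names, and passing --region explicitly keeps the token's region aligned with the registry's.

```shell
# Build the registry hostname from an account ID ($1) and a region ($2).
ecr_registry_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com' "$1" "$2"
}

# Log Docker in to that registry, piping the password over stdin.
ecr_login() {
  aws --profile production ecr get-login-password --region "$2" |
    docker login --username AWS --password-stdin "$(ecr_registry_host "$1" "$2")"
}

# Example (requires AWS credentials and a running Docker daemon):
# ecr_login 123456789012 us-east-1
```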
Finally, let's try pulling an image:
docker pull "${ARTIFACTS_ACCOUNT_ID}.dkr.ecr.${ARTIFACTS_ECR_REGION}.amazonaws.com/example-repo:latest"
If this doesn't work, ensure that the repository exists, the Docker image tag latest exists, and the ECR repository policy allows your IAM principal to use the IAM actions required for pulling.
Summary
To allow EKS in the production account to pull images from an ECR repository in the artifacts account, we must:
- Attach a policy to the EKS node IAM role that allows the role to obtain credentials that uniquely identify it globally. The permission is ecr:GetAuthorizationToken and must be granted on "Resource": ["*"] in the production account.
- Create an ECR repository policy on the target repository which allows the EKS node IAM role to perform the actions it needs to pull (or push) images.