AWS Fargate tasks won't start reliably

I have an ECS cluster with a number of different tasks in it (all using the same Docker image but with different environment variables).

Some of the tasks come up without problems, but others fail often, even though I've used the same VPC, subnet, and security group. The error message is: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post https://api.ecr..

Bizarrely, the same task sometimes comes up if I create a new task definition, or if I delete the ECR repository and re-upload the Docker image.

I'm unable to draw any conclusion from this.

Update: strangely, the task starts successfully when I deregister the task definition and recreate it with the same specs, but only once.

Haag answered 25/3, 2022 at 16:3 Comment(2)
Could some of the subnets lack a route to ECR? For example, a subnet without a route to an Internet Gateway, NAT Gateway, or ECR VPC endpoint? Azikiwe
I don't think so. If I recreate the task definition every time before I start the task, it starts successfully. However, if I stop the task and try to restart it, the task fails. Haag

It turns out that one has to select the taskExecution role for both "Task Role - override" and "Task Execution Role - override" in the Advanced Options section of Run Task when starting the task. I don't know why it worked intermittently on random retries, or why it worked when I recreated the task definition every time.
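
For anyone starting the task from code rather than the console, here is a minimal sketch of the same two overrides using boto3. The ECS RunTask API accepts `taskRoleArn` and `executionRoleArn` under `overrides`. The function names, cluster name, task definition, subnet, security group, and role ARN are all placeholders I made up for illustration, not values from my setup, and I have not verified that this reproduces exactly what the console does.

```python
from typing import Any, Dict, List


def build_run_task_request(
    cluster: str,
    task_definition: str,
    subnets: List[str],
    security_groups: List[str],
    execution_role_arn: str,
) -> Dict[str, Any]:
    """Build keyword arguments for the ECS RunTask call.

    The same role ARN is used for both overrides, mirroring the
    "Task Role - override" and "Task Execution Role - override"
    fields in the console.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
            }
        },
        "overrides": {
            "taskRoleArn": execution_role_arn,
            "executionRoleArn": execution_role_arn,
        },
    }


def run_task(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to ECS (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3

    return boto3.client("ecs").run_task(**request)


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    request = build_run_task_request(
        cluster="my-cluster",
        task_definition="my-task:1",
        subnets=["subnet-00000000"],
        security_groups=["sg-00000000"],
        execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    )
    print(request["overrides"])
```

Calling `run_task(request)` would then start the task with both roles overridden. Building the request separately makes it easy to inspect the overrides before anything is sent to AWS.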

Haag answered 28/3, 2022 at 13:45 Comment(0)
