AWS Glue Access denied for crawler with administrator policy attached
S

7

10

I am trying to run a crawler across an S3 datastore in my account which contains two CSV files. However, when I run the crawler, no tables are loaded, and I see the following errors in CloudWatch for each of the files:

  • Error Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
  • Tables created did not infer schemas from this file.

This is especially odd as the IAM role has the AdministratorAccess policy attached, so there should not be any access denied issue.

Any help would be appreciated.

Sleepyhead answered 17/8, 2018 at 16:19 Comment(0)
S
-1

I checked everything offered in the other suggestions, and I wasn't missing any of it. It turns out there was another level of restrictions on reading the bucket imposed by my organization, though I'm not sure what it was.

Sleepyhead answered 20/8, 2018 at 20:16 Comment(0)
A
16

Check to see if the files you are crawling are encrypted. If they are, then your Glue role probably doesn't have a policy that allows it to decrypt.

If so, it might need something like this:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": [
      "kms:Decrypt"
    ],
    "Resource": [
      "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
      "arn:aws:kms:us-west-2:111122223333:key/0987dcba-09fe-87dc-65ba-ab0987654321"
    ]
  }
}
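
To check whether an object is KMS-encrypted in the first place, one option is to inspect the JSON printed by `aws s3api head-object --bucket <bucket> --key <key>`. The sketch below is only an illustration: the helper name `kms_key_for_object` and the sample output are hypothetical, and it assumes the `ServerSideEncryption` and `SSEKMSKeyId` fields that head-object reports for SSE-KMS objects.

```python
import json


def kms_key_for_object(head_object_json):
    """Return the KMS key reported for an object, or None if it is not SSE-KMS.

    `head_object_json` is the JSON text printed by `aws s3api head-object`.
    """
    meta = json.loads(head_object_json)
    encryption = meta.get("ServerSideEncryption", "")
    # "aws:kms" marks SSE-KMS; startswith also covers the "aws:kms:dsse" variant.
    if encryption.startswith("aws:kms"):
        return meta.get("SSEKMSKeyId")
    return None


# Hypothetical sample output, reusing the example key ARN from the policy above.
sample = json.dumps({
    "ContentLength": 1024,
    "ServerSideEncryption": "aws:kms",
    "SSEKMSKeyId": "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
})

if __name__ == "__main__":
    print(kms_key_for_object(sample))
```

If this returns a key ARN, that ARN is what would go in the `Resource` list of the `kms:Decrypt` statement.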
Antonelli answered 18/8, 2018 at 5:46 Comment(0)
B
9

Make sure the policies attached to your IAM role include these:

  1. AmazonS3FullAccess
  2. AWSGlueConsoleFullAccess
  3. AWSGlueServiceRole
Buhler answered 17/8, 2018 at 20:4 Comment(3)
And if the files are encrypted with a KMS key, the role also needs Decrypt permission (kms:Decrypt) for that key. - Selfannihilation
Can you explain why AWSGlueConsoleFullAccess is necessary? - Anadiplosis
AmazonS3FullAccess is a very loose policy. You can tighten this up by including a customer managed policy that allows s3:ListBucket and s3:GetObject for the bucket that contains your data. See the note under step 5 of: docs.aws.amazon.com/glue/latest/dg/create-an-iam-role.html - Goebbels
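
As a rough sketch of the tighter policy suggested in the last comment, the snippet below builds a customer managed policy that allows only s3:ListBucket and s3:GetObject on one bucket. The function name and the bucket name `my-data-bucket` are placeholders, not anything from the original posts. Note that s3:ListBucket applies to the bucket ARN, while s3:GetObject applies to the objects under it.

```python
import json


def read_only_bucket_policy(bucket):
    """Build an IAM policy document granting list/read access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading is an object-level action, so it targets bucket/*.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-data-bucket" is a placeholder; substitute the bucket being crawled.
    print(json.dumps(read_only_bucket_policy("my-data-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console as a customer managed policy and attached to the crawler's role in place of AmazonS3FullAccess.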
D
3

We had a similar issue with an S3 crawler. According to AWS, S3 crawlers, unlike JDBC crawlers, do not create an ENI in your VPC. This means your bucket policy must allow access from outside the VPC.

Check that your bucket policy does not have an explicit deny somewhere on s3:*. If there is one, add a condition to that statement that exempts the crawler's role, matching the role ID against aws:userid. Keep in mind that the role ID and the role ARN are not the same thing.

To get the role id:

aws iam get-role --role-name Test-Role

Output:

{
  "Role": {
      "AssumeRolePolicyDocument": "<URL-encoded-JSON>",
      "RoleId": "AIDIODR4TAW7CSEXAMPLE",
      "CreateDate": "2013-04-18T05:01:58Z",
      "RoleName": "Test-Role",
      "Path": "/",
      "Arn": "arn:aws:iam::123456789012:role/Test-Role"
  }
}  
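
A minimal sketch of how the RoleId from that output could be turned into the condition described above. It assumes the common pattern of a `StringNotLike` condition on `aws:userid` with the value `<role id>:*`, since for an assumed role that key has the form `role-id:session-name`. The function names, the `Sid`, and the bucket name are hypothetical.

```python
import json


def role_id_from_get_role(output_json):
    """Extract RoleId from the JSON printed by `aws iam get-role`."""
    return json.loads(output_json)["Role"]["RoleId"]


def deny_unless_role(bucket, role_id):
    """Build a Deny statement on s3:* that does not apply to sessions of one role."""
    return {
        "Sid": "DenyAllExceptCrawlerRole",  # hypothetical statement ID
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        # The ":*" suffix matches any session name under the role.
        "Condition": {"StringNotLike": {"aws:userid": [f"{role_id}:*"]}},
    }


if __name__ == "__main__":
    get_role_output = '{"Role": {"RoleId": "AIDIODR4TAW7CSEXAMPLE", "RoleName": "Test-Role"}}'
    role_id = role_id_from_get_role(get_role_output)
    print(json.dumps(deny_unless_role("my-data-bucket", role_id), indent=2))
```

As written, this statement denies everyone except the one role, so an existing deny statement would normally keep its other exempted IDs in the same `aws:userid` list.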

You might also need to add a statement that allows s3:PutObject* and s3:GetObject* with the assumed role as the AWS principal. The assumed role will look something like:

arn:aws:sts::123456789012:assumed-role/Test-Role/AWS-Crawler
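
For illustration, such an allow statement might be assembled as below. This is a sketch under assumptions: the helper names and the bucket name are made up, and the session name `AWS-Crawler` is simply the one shown in the ARN above, so the actual session name should be checked before relying on it.

```python
import json


def assumed_role_arn(account_id, role_name, session_name):
    """Build the STS ARN of an assumed-role session."""
    return f"arn:aws:sts::{account_id}:assumed-role/{role_name}/{session_name}"


def allow_crawler_statement(bucket, principal_arn):
    """Build a bucket policy statement allowing object reads and writes."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": ["s3:PutObject*", "s3:GetObject*"],
        # Both actions are object-level, so the resource is bucket/*.
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }


if __name__ == "__main__":
    arn = assumed_role_arn("123456789012", "Test-Role", "AWS-Crawler")
    print(json.dumps(allow_crawler_statement("my-data-bucket", arn), indent=2))
```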

Hope this helps.

Deepfreeze answered 24/1, 2019 at 4:45 Comment(0)
G
1

In my case the issue was that the crawler was configured in a different region than the S3 bucket it was meant to crawl. After configuring a new crawler in the same region as my S3 bucket, the problem was resolved.

Generalization answered 1/8, 2020 at 16:11 Comment(0)
E
0

This is an S3 bucket policy issue. I made my tables public (bad policy I know) and it worked.

Epigenous answered 3/8, 2020 at 11:38 Comment(0)
J
0

Here are the complete roles you need to grant for the Glue crawler to work properly:

[Image: IAM Roles]

Jamboree answered 14/9, 2022 at 11:29 Comment(0)
