Automatic AWS DynamoDB to S3 export failing with "role/DataPipelineDefaultRole is invalid"
4

10

Following the step-by-step instructions on this page exactly, I am trying to export the contents of one of my DynamoDB tables to an S3 bucket. I created a pipeline exactly as instructed, but it fails to run. It seems to have trouble identifying or running an EC2 resource to do the export. When I open EMR in the AWS Console, I see entries like this:

Cluster: df-0..._@EmrClusterForBackup_2015-03-06T00:33:04
Terminated with errors
EMR service role arn:aws:iam::...:role/DataPipelineDefaultRole is invalid

Why am I getting this message? Do I need to set up/configure something else for the pipeline to run?

UPDATE: Under IAM -> Roles in the AWS console, I see this for DataPipelineDefaultResourceRole:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "s3:List*",
            "s3:Put*",
            "s3:Get*",
            "s3:DeleteObject",
            "dynamodb:DescribeTable",
            "dynamodb:Scan",
            "dynamodb:Query",
            "dynamodb:GetItem",
            "dynamodb:BatchGetItem",
            "dynamodb:UpdateTable",
            "rds:DescribeDBInstances",
            "rds:DescribeDBSecurityGroups",
            "redshift:DescribeClusters",
            "redshift:DescribeClusterSecurityGroups",
            "cloudwatch:PutMetricData",
            "datapipeline:PollForTask",
            "datapipeline:ReportTaskProgress",
            "datapipeline:SetTaskStatus",
            "datapipeline:PollForTask",
            "datapipeline:ReportTaskRunnerHeartbeat"
        ],
        "Resource": ["*"]
    }]
}

And this for DataPipelineDefaultRole:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "s3:List*",
            "s3:Put*",
            "s3:Get*",
            "s3:DeleteObject",
            "dynamodb:DescribeTable",
            "dynamodb:Scan",
            "dynamodb:Query",
            "dynamodb:GetItem",
            "dynamodb:BatchGetItem",
            "dynamodb:UpdateTable",
            "ec2:DescribeInstances",
            "ec2:DescribeSecurityGroups",
            "ec2:RunInstances",
            "ec2:CreateTags",
            "ec2:StartInstances",
            "ec2:StopInstances",
            "ec2:TerminateInstances",
            "elasticmapreduce:*",
            "rds:DescribeDBInstances",
            "rds:DescribeDBSecurityGroups",
            "redshift:DescribeClusters",
            "redshift:DescribeClusterSecurityGroups",
            "sns:GetTopicAttributes",
            "sns:ListTopics",
            "sns:Publish",
            "sns:Subscribe",
            "sns:Unsubscribe",
            "iam:PassRole",
            "iam:ListRolePolicies",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "cloudwatch:*",
            "datapipeline:DescribeObjects",
            "datapipeline:EvaluateExpression"
        ],
        "Resource": ["*"]
    }]
}

Do these need to be modified somehow?

Kellene answered 6/3, 2015 at 20:21 Comment(3)
Can you add what your IAM policies are? - Blackwell
@MikeKobit Could you explain how I can get to these policies from the AWS Web Console? Thanks. - Kellene
Check the account roles and their IAM policies. See Setting Up IAM Roles. - Blackwell
3

I ran into the same error.

In IAM, attach the AWSDataPipelineRole managed policy to DataPipelineDefaultRole.

I also had to update the Trust Relationship to the following (I needed to add ec2, which is not in the documentation):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ec2.amazonaws.com",
          "elasticmapreduce.amazonaws.com",
          "datapipeline.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
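
For anyone who prefers to script these two steps, here is a minimal Python sketch. It assumes boto3 as the client library and assumes the managed policy lives at the ARN shown in `MANAGED_POLICY_ARN` (please verify that in your own account); the helper names `build_trust_policy` and `apply_fix` are made up for illustration.

```python
import json

# Assumption: ARN of the AWS-managed policy named above; check it in your IAM console.
MANAGED_POLICY_ARN = "arn:aws:iam::aws:policy/service-role/AWSDataPipelineRole"
ROLE_NAME = "DataPipelineDefaultRole"


def build_trust_policy(services):
    """Return a trust policy that lets each listed service principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Same three principals as in the trust relationship shown above.
TRUST_POLICY = build_trust_policy([
    "ec2.amazonaws.com",
    "elasticmapreduce.amazonaws.com",
    "datapipeline.amazonaws.com",
])


def apply_fix(iam_client, role_name=ROLE_NAME):
    """Attach the managed policy, then replace the role's trust relationship.

    iam_client is expected to behave like boto3.client("iam").
    """
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=MANAGED_POLICY_ARN)
    iam_client.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(TRUST_POLICY),
    )
```

With boto3 installed and credentials configured, calling `apply_fix(boto3.client("iam"))` should make the same changes as the console steps. Note that `update_assume_role_policy` replaces the whole trust relationship rather than merging into it.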
Stylet answered 16/11, 2016 at 4:38 Comment(0)
2

There is a similar question on the AWS forum, and the problem appears to be related to an issue with managed policies:

https://forums.aws.amazon.com/message.jspa?messageID=606756

In that thread, they recommend defining those roles with specific inline policies for both the access policy and the trust policy, changing some permissions. Oddly enough, the specific inline policies can be found at

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-iam-roles.html

Squad answered 8/11, 2015 at 13:7 Comment(0)
0

I had the same issue. The managed policies were correct in my case, but the trust relationships for both the DataPipelineDefaultRole and DataPipelineDefaultResourceRole roles were out of date. I had to update them using the documentation Gonfva linked to above.
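
To see whether a trust relationship is out of date before changing it, something like the following Python sketch may help. It only inspects a trust policy document that has already been loaded as a dict; the function names and the list of required services are my own assumptions, so pass in whatever the current documentation lists for each role.

```python
def trusted_services(trust_policy):
    """Collect the service principals that are allowed to call sts:AssumeRole."""
    found = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. "*", which names no service
            continue
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        found.update(services)
    return found


def missing_services(trust_policy, required):
    """Return the required service principals that the policy does not trust."""
    return set(required) - trusted_services(trust_policy)
```

As far as I know, boto3 returns the trust policy already decoded under `iam.get_role(RoleName=...)["Role"]["AssumeRolePolicyDocument"]`, so that value could be passed straight to `missing_services`. An empty result means every service you listed is trusted.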

Puffball answered 10/4, 2017 at 7:10 Comment(0)
0

The issue might be with the IAM roles.

This might help, although not in all cases. I had the same problem when I was trying to export DynamoDB data to S3 using Data Pipeline. In my case the issue was with the two roles used by Data Pipeline: the resource role, DataPipelineDefaultResourceRole, and the pipeline role, DataPipelineDefaultRole.

Solution: Go to IAM -> Roles -> DataPipelineDefaultResourceRole and attach the AmazonDynamoDBFullAccess and AmazonS3FullAccess policies to this role. Do the same for DataPipelineDefaultRole.

Please note: You should restrict DynamoDB and S3 access to what your use case needs rather than leave full access in place.

Try running your data pipeline now. It should move to the Running state.
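
As a rough illustration of what restricted access could look like, here is a small Python sketch that builds a policy document limited to one table and one bucket. The action lists are simply copied from the read and S3 actions in the policies shown in the question; I have not verified that this subset is enough for the export to succeed, and the function name and example ARN are hypothetical.

```python
import json


def scoped_export_policy(table_arn, bucket_name):
    """Build a policy document scoped to a single DynamoDB table and S3 bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only DynamoDB actions taken from the question's policies.
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeTable",
                    "dynamodb:Scan",
                    "dynamodb:Query",
                    "dynamodb:GetItem",
                    "dynamodb:BatchGetItem",
                ],
                "Resource": [table_arn],
            },
            {
                # S3 actions from the question, limited to the bucket and its objects.
                "Effect": "Allow",
                "Action": ["s3:List*", "s3:Put*", "s3:Get*"],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
        ],
    }


def to_json(policy):
    """Serialize the policy so it can be pasted into the IAM console."""
    return json.dumps(policy, indent=2)
```

For example, `to_json(scoped_export_policy("arn:aws:dynamodb:us-east-1:123456789012:table/MyTable", "my-export-bucket"))` produces JSON that could be used as an inline policy in place of the two full-access policies, with the table ARN and bucket name swapped for your own.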

Chufa answered 23/5, 2021 at 14:20 Comment(0)
