AWS Textract StartDocumentAnalysis function not publishing a message to the SNS Topic
I am working with AWS Textract and want to analyze a multipage document, so I have to use the asynchronous API. I called the startDocumentAnalysis function and got a JobId back, but the job is also supposed to publish to an SNS topic, which in turn should trigger a Lambda function that I have subscribed to that topic. That never happens.

These are my serverless file and handler file.

provider:
  name: aws
  runtime: nodejs8.10
  stage: dev
  region: us-east-1
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource: { "Fn::Join": ["", ["arn:aws:s3:::${self:custom.secrets.IMAGE_BUCKET_NAME}", "/*" ] ] }
    - Effect: "Allow"
      Action:
        - "sts:AssumeRole"
        - "SNS:Publish"
        - "lambda:InvokeFunction"
        - "textract:DetectDocumentText"
        - "textract:AnalyzeDocument"
        - "textract:StartDocumentAnalysis"
        - "textract:GetDocumentAnalysis"
      Resource: "*"

custom:
  secrets: ${file(secrets.${opt:stage, self:provider.stage}.yml)}

functions:
  routes:
    handler: src/functions/routes/handler.run
    events:
      - s3:
          bucket: ${self:custom.secrets.IMAGE_BUCKET_NAME}
          event: s3:ObjectCreated:*

  textract:
    handler: src/functions/routes/handler.detectTextAnalysis
    events:
      - sns: "TextractTopic"

resources:
  Resources:
    TextractTopic:
        Type: AWS::SNS::Topic
        Properties:
          DisplayName: "Start Textract API Response"
          TopicName: TextractResponseTopic

Handler.js

const AWS = require('aws-sdk');
const textract = new AWS.Textract();

module.exports.run = async (event) => {
  // Bucket and key of the object whose upload triggered this function
  const uploadedBucket = event.Records[0].s3.bucket.name;
  const uploadedObject = event.Records[0].s3.object.key;

  const params = {
    DocumentLocation: {
      S3Object: {
        Bucket: uploadedBucket,
        Name: uploadedObject
      }
    },
    FeatureTypes: [
      "TABLES",
      "FORMS"
    ],
    // Role that Textract assumes, and the topic it publishes the job status to
    NotificationChannel: {
      RoleArn: 'arn:aws:iam::<account-id>:role/qvalia-ocr-solution-dev-us-east-1-lambdaRole',
      SNSTopicArn: 'arn:aws:sns:us-east-1:<account-id>:TextractTopic'
    }
  };

  // Start the asynchronous job; the response contains the JobId
  const textractOutput = await new Promise((resolve, reject) => {
    textract.startDocumentAnalysis(params, function(err, data) {
      if (err) reject(err);
      else resolve(data);
    });
  });
};

When I manually publish an SNS message to the topic, it does fire the textract Lambda, which currently contains this:

module.exports.detectTextAnalysis = async (event) => {
  console.log('SNS Topic isssss Generated');
  console.log(event.Records[0].Sns.Message);
};

What is my mistake, and why is the Textract startDocumentAnalysis call not publishing a message that triggers the Lambda?

Note: I haven't called startDocumentTextDetection before calling startDocumentAnalysis, though as far as I know it is not necessary to call it first.
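
For reference, once the notification does arrive, the SNS message body is a JSON string describing the finished job. The following is only a sketch of how the textract Lambda could read it: the function name parseTextractNotification and the sample event are made up for illustration, and the field names (JobId, Status, API) are what Textract's completion notification is expected to carry, so check them against a real payload.

```javascript
// Sketch: pull the job details out of the SNS event that Textract
// publishes when an asynchronous job finishes.
// Assumption: the message is JSON with JobId, Status and API fields.
function parseTextractNotification(event) {
  const record = event.Records[0];
  const message = JSON.parse(record.Sns.Message);
  return {
    jobId: message.JobId,
    status: message.Status,
    api: message.API,
    succeeded: message.Status === 'SUCCEEDED',
  };
}

// Hypothetical event, for illustration only.
const sampleEvent = {
  Records: [
    {
      Sns: {
        Message: JSON.stringify({
          JobId: 'job-123',
          Status: 'SUCCEEDED',
          API: 'StartDocumentAnalysis',
        }),
      },
    },
  ],
};

console.log(parseTextractNotification(sampleEvent));
```

The returned jobId is what would then be passed to getDocumentAnalysis to fetch the results.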

Carnahan answered 23/6, 2019 at 23:53 Comment(5)
Does qvalia-ocr-solution-dev-us-east-1-lambdaRole have enough permissions to publish over SNS? - Formyl
I am also working with Amazon Textract. SNS publishing was working about a week ago and now it isn't. I didn't change anything in the publishing part of my application, and now it is broken. The developers must have broken something, since it is still in open preview. - Locule
@Locule I have the same problem and I'm going crazy trying to figure out what is wrong with this. Thanks for your comment. - Ursuline
@Locule I noticed that it works if I use a permit-all policy on the role that pushes to SNS. I don't know which permission I'm forgetting. - Ursuline
@RubenJGarcia I got mine working: the IAM role I was using did not allow Textract specifically in its Trusted Relationships. { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": [ "lambda.amazonaws.com", "textract.amazonaws.com" ] }, "Action": "sts:AssumeRole" } ] } - Locule
L
10

Make sure the Trusted Relationships of the role you are using include the Textract service principal:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "textract.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
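
As a quick sanity check, a trust policy document can be inspected for the Textract principal. This is only a sketch: the helper name trustsService and the two sample policies are hypothetical, and it assumes you have already obtained the policy as a parsed JSON object (for example by copying it from the role's Trust relationships tab in the IAM console).

```javascript
// Sketch: return true if the trust policy allows the given service
// principal to call sts:AssumeRole on the role.
function trustsService(trustPolicy, service) {
  return trustPolicy.Statement.some((stmt) => {
    if (stmt.Effect !== 'Allow') return false;
    // Action and Principal.Service may each be a string or an array.
    const actions = [].concat(stmt.Action || []);
    const principal = stmt.Principal || {};
    const services = [].concat(principal.Service || []);
    return actions.includes('sts:AssumeRole') && services.includes(service);
  });
}

// Hypothetical policies, for illustration only.
const lambdaOnlyPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { Service: 'lambda.amazonaws.com' },
      Action: 'sts:AssumeRole',
    },
  ],
};

const lambdaAndTextractPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { Service: ['lambda.amazonaws.com', 'textract.amazonaws.com'] },
      Action: 'sts:AssumeRole',
    },
  ],
};

console.log(trustsService(lambdaOnlyPolicy, 'textract.amazonaws.com'));
console.log(trustsService(lambdaAndTextractPolicy, 'textract.amazonaws.com'));
```

A role created only for Lambda usually looks like the first policy, which would explain why Textract cannot assume it to publish the notification.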
Locule answered 11/7, 2019 at 21:38 Comment(1)
This answer, in combination with ensuring that I have both the "sts:AssumeRole" and "sns:Publish" permissions, is what worked for me. Thanks! - Barehanded
B
5

The SNS Topic name must be AmazonTextract

In the end, your ARN should look like this:

arn:aws:sns:us-east-2:111111111111:AmazonTextract
Barger answered 29/4, 2021 at 3:20 Comment(1)
Yes, if you use the AWS-managed policy "AmazonTextractServiceRole", it is resource-restricted to "arn:aws:sns:*:*:AmazonTextract*", meaning that your SNS topic name must at least start with "AmazonTextract". - Sarcoma
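
The naming constraint described in the comment above can be checked before deploying. This is a rough sketch, not an IAM policy evaluator: the function name matchesManagedPolicyPattern is hypothetical, and it only tests a topic ARN against the single pattern "arn:aws:sns:*:*:AmazonTextract*" quoted in that comment.

```javascript
// Sketch: does the topic ARN match arn:aws:sns:*:*:AmazonTextract* ?
function matchesManagedPolicyPattern(topicArn) {
  // Expected layout: arn:aws:sns:<region>:<account-id>:<topic-name>
  const parts = topicArn.split(':');
  if (parts.length !== 6) return false;
  if (parts[0] !== 'arn' || parts[1] !== 'aws' || parts[2] !== 'sns') return false;
  return parts[5].startsWith('AmazonTextract');
}

console.log(matchesManagedPolicyPattern('arn:aws:sns:us-east-2:111111111111:AmazonTextract'));
console.log(matchesManagedPolicyPattern('arn:aws:sns:us-east-1:111111111111:TextractTopic'));
```

If you attach your own policy with sns:Publish on the topic instead of the managed one, this naming restriction should not apply.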
U
0

If your bucket is encrypted, you should also grant KMS permissions; otherwise it won't work.

Ursuline answered 4/7, 2019 at 13:19 Comment(1)
Can you please specify which KMS actions you are referring to? - Stine
B
0

I was able to get this working directly via the Serverless Framework by adding a Lambda execution role resource to my serverless.yml file:

resources:
  Resources:
    IamRoleLambdaExecution:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Principal:
                Service:
                  - lambda.amazonaws.com
                  - textract.amazonaws.com
              Action: sts:AssumeRole

And then I just used the same role generated by Serverless (for the Lambda function) as the notification channel role parameter when starting the Textract document analysis.

Thanks to this post for pointing me in the right direction!
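
A minimal sketch of that step, under a few assumptions: the helper name buildAnalysisParams is made up, and the environment variable names TEXTRACT_ROLE_ARN and TEXTRACT_TOPIC_ARN are hypothetical (they would have to be defined in serverless.yml, for example from the ARN of the IamRoleLambdaExecution resource and of the SNS topic).

```javascript
// Sketch: build the StartDocumentAnalysis request so that Textract
// assumes the same role the Lambda function runs under.
function buildAnalysisParams(bucket, key, roleArn, topicArn) {
  return {
    DocumentLocation: { S3Object: { Bucket: bucket, Name: key } },
    FeatureTypes: ['TABLES', 'FORMS'],
    NotificationChannel: { RoleArn: roleArn, SNSTopicArn: topicArn },
  };
}

// Hypothetical values, for illustration only. In the handler these would
// come from the S3 event and from process.env.TEXTRACT_ROLE_ARN and
// process.env.TEXTRACT_TOPIC_ARN (assumed variable names).
const exampleParams = buildAnalysisParams(
  'my-bucket',
  'scans/invoice.pdf',
  'arn:aws:iam::111111111111:role/example-lambdaRole',
  'arn:aws:sns:us-east-1:111111111111:AmazonTextractExample'
);

console.log(JSON.stringify(exampleParams, null, 2));
```

The resulting object is what gets passed to textract.startDocumentAnalysis in the handler.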

Barehanded answered 28/10, 2020 at 20:56 Comment(0)
I
0

For anyone using the CDK in TypeScript, you will need to add Lambda as a ServicePrincipal as usual to the Lambda Execution Role. Next, access the assumeRolePolicy of the execution role and call the addStatements method.

The basic execution role, without any additional statements (add those later):

  this.executionRole = new iam.Role(this, 'ExecutionRole', {
    assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
  });

Next, add Textract as an additional ServicePrincipal:

  this.executionRole.assumeRolePolicy?.addStatements(
    new PolicyStatement({
      principals: [
        new ServicePrincipal('textract.amazonaws.com'),
      ],
      actions: ['sts:AssumeRole']
    })
  );

Also, ensure the execution role has full permissions on the target SNS topic (note that the topic already exists and is accessed via the fromTopicArn method):

 const stmtSNSOps = new PolicyStatement({
    effect: iam.Effect.ALLOW,
    actions: [
      "SNS:*"
    ],
    resources: [
      this.textractJobStatusTopic.topicArn
    ]
  });

Add the policy statement to a global policy (within the active stack):

 this.standardPolicy = new iam.Policy(this, 'Policy', {
    statements: [
      ...
      stmtSNSOps, 
      ...
    ]
  });

Finally, attach the policy to the execution role:

  this.executionRole.attachInlinePolicy(this.standardPolicy);
Indecent answered 9/11, 2020 at 22:29 Comment(0)
