When I try to parse a PDF file stored in Amazon S3, I get the error: Request has unsupported document format.
I am using Amazon Textract with boto3. When I try to parse a PDF file stored in Amazon S3, the call fails with the error "Request has unsupported document format". I am fairly new to this, but the Textract documentation says that PDF files are supported.
This is the code I am using:
import boto3
textractClient = boto3.client('textract', region_name='us-east-1')
response = textractClient.detect_document_text(
    Document={'S3Object': {'Bucket': 'bucketName', 'Name': 'filename.pdf'}})
blocks = response['Blocks']
This raises the error: Request has unsupported document format.
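For comparison, here is a minimal sketch of the asynchronous route (start_document_text_detection followed by get_document_text_detection), which is the API Textract provides for multi-page PDFs in S3. It rests on an unconfirmed assumption: that the synchronous detect_document_text call is what rejects this particular file. The helper name detect_pdf_text and its poll_seconds and max_polls parameters are hypothetical, made up for illustration.

```python
import time


def detect_pdf_text(client, bucket, name, poll_seconds=5, max_polls=60):
    """Start an asynchronous text-detection job for an S3 object and return its blocks.

    `client` is expected to be a boto3 Textract client. This helper and its
    polling parameters are illustrative assumptions, not part of boto3.
    """
    job = client.start_document_text_detection(
        DocumentLocation={'S3Object': {'Bucket': bucket, 'Name': name}})
    job_id = job['JobId']

    # Poll until the job is no longer IN_PROGRESS, or give up after max_polls.
    for _ in range(max_polls):
        result = client.get_document_text_detection(JobId=job_id)
        if result['JobStatus'] != 'IN_PROGRESS':
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError('Textract job %s did not finish in time' % job_id)

    if result['JobStatus'] not in ('SUCCEEDED', 'PARTIAL_SUCCESS'):
        raise RuntimeError('Textract job %s ended with status %s'
                           % (job_id, result['JobStatus']))

    # Results can span several pages of output; follow NextToken until it is absent.
    blocks = list(result.get('Blocks', []))
    while result.get('NextToken'):
        result = client.get_document_text_detection(
            JobId=job_id, NextToken=result['NextToken'])
        blocks.extend(result.get('Blocks', []))
    return blocks


# Usage with the same client and placeholder names as above (requires AWS credentials):
# import boto3
# textractClient = boto3.client('textract', region_name='us-east-1')
# blocks = detect_pdf_text(textractClient, 'bucketName', 'filename.pdf')
```

Polling in a loop keeps the sketch short; Textract can also publish a completion notification to an SNS topic, which avoids polling for long-running jobs.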