AWS Lambda connection to SQS timed out

I am working on a task that involves a Lambda function running inside a VPC.

This function is supposed to push messages to SQS, and the Lambda execution role has the policies AWSLambdaSQSQueueExecutionRole and AWSLambdaVPCAccessExecutionRole attached.

Lambda function:

import boto3

# Create SQS client
sqs = boto3.client('sqs')

queue_url = 'https://sqs.ap-east-1a.amazonaws.com/073x08xx43xx37/xyz-queue'

# Send message to SQS queue
response = sqs.send_message(
    QueueUrl=queue_url,
    DelaySeconds=10,
    MessageAttributes={
        'Title': {
            'DataType': 'String',
            'StringValue': 'Tes1'
        },
        'Author': {
            'DataType': 'String',
            'StringValue': 'Test2'
        },
        'WeeksOn': {
            'DataType': 'Number',
            'StringValue': '1'
        }
    },
    MessageBody=(
        'Testing'
     )
)

print(response['MessageId'])

On testing, the execution result is:

{
  "errorMessage": "2020-07-24T12:12:15.924Z f8e794fc-59ba-43bd-8fee-57f417fa50c9 Task timed out after 3.00 seconds"
}

I increased the timeout under Basic Settings to 5 seconds and then to 10 seconds, but the error kept occurring.

If anyone has faced a similar issue in the past or has an idea how to resolve this, please help me out.

Thank you in advance.

Alate answered 24/7, 2020 at 12:45 Comment(2)
Do you have either a VPC endpoint or a NAT that would allow your Lambda to connect to SQS? - Balliol
Yeah. I have a NAT security group already added. - Alate
D
16

When an AWS Lambda function is configured to use an Amazon VPC, it connects to a nominated subnet of the VPC. This allows the Lambda function to communicate with other resources inside the VPC. However, by default it cannot communicate with the Internet, because the function is not given a public IP address. This is a problem because the Amazon SQS public endpoint lives on the Internet, so the function times out while trying to reach it.

Thus, you have 3 options:

Option 1: Do not connect to a VPC

If your Lambda function does not need to communicate with a resource in the VPC (such as the simple function you have provided above), simply do not connect it to the VPC. When a Lambda function is not connected to a VPC, it can communicate with the Internet and the Amazon SQS public endpoint.

Option 2: Use a VPC Endpoint

A VPC Endpoint provides a means of accessing an AWS service without going via the Internet. You would configure a VPC endpoint for Amazon SQS. Then, when the Lambda function wishes to connect with the SQS queue, it can access SQS via the endpoint rather than via the Internet. This is normally a good option if the Lambda function needs to communicate with other resources in the VPC.
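
As a rough illustration of Option 2, the sketch below builds the parameters for an interface VPC endpoint for SQS and passes them to the EC2 `create_vpc_endpoint` call in boto3. The helper names and every ID shown (VPC, subnet, security group) are placeholders, not values from the question. It assumes the endpoint's security group allows inbound HTTPS (port 443) from the Lambda function, and that DNS support and DNS hostnames are enabled on the VPC so that private DNS can work.

```python
def sqs_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (SQS interface endpoint)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Regional service name for SQS, e.g. com.amazonaws.us-east-1.sqs
        "ServiceName": "com.amazonaws.%s.sqs" % region,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Lets the regional SQS hostname resolve to the endpoint's private IPs.
        "PrivateDnsEnabled": True,
    }


def create_sqs_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint. Needs boto3 and AWS credentials, so import lazily."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    request = sqs_endpoint_request(region, vpc_id, subnet_ids, security_group_ids)
    return ec2.create_vpc_endpoint(**request)


if __name__ == "__main__":
    # Placeholder IDs: only prints the request, does not call AWS.
    print(sqs_endpoint_request("us-east-1", "vpc-0123", ["subnet-a"], ["sg-1"]))
```

Keeping the request builder separate from the API call makes it easy to inspect what would be sent before anything is created.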

Option 3: Use a NAT Gateway

If the Lambda function is configured to use a private subnet, it will be able to access the Internet if a NAT Gateway has been provisioned in a public subnet and the Route Table for the private subnet points to the NAT Gateway. This involves extra expense and is only worthwhile if there is an additional need for a NAT Gateway.
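
For Option 3, one way to sanity-check the routing is to look at the route table of the Lambda function's subnet and confirm it has an active default route to a NAT Gateway. The sketch below is only an illustration: the function names are made up for this example, and the lookup filters on `association.subnet-id`, so it will not find a subnet that relies implicitly on the VPC's main route table.

```python
def has_nat_default_route(route_table):
    """Return True if the route table sends 0.0.0.0/0 to a NAT Gateway.

    `route_table` is one entry of the "RouteTables" list returned by
    ec2.describe_route_tables.
    """
    for route in route_table.get("Routes", []):
        if (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            and route.get("NatGatewayId")
            and route.get("State") == "active"
        ):
            return True
    return False


def subnet_routes_via_nat(subnet_id, region):
    """Fetch route tables explicitly associated with the subnet and check them.

    Needs boto3 and AWS credentials, so boto3 is imported lazily.
    """
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return any(has_nat_default_route(rt) for rt in response["RouteTables"])
```

A default route that targets an Internet Gateway instead (a `GatewayId` starting with `igw-`) means the subnet is public, and a Lambda function placed there still cannot reach the Internet.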

Durware answered 24/7, 2020 at 13:1 Comment(4)
Thank you for explaining! This is really informative. I am using a NAT Gateway (Option 3). I have added a VPC, a public subnet and a NAT security group (where I have configured my inbound and outbound rules). I guess this should allow Lambda to connect to SQS? Unfortunately, I am still getting the timeout error. Please let me know if I am missing something. - Alate
Make sure that the Route Table associated with the private subnets (especially the one with the Lambda function) is configured to redirect traffic with a destination of 0.0.0.0/0 to the NAT Gateway. The security group on the NAT Gateway should allow any Inbound traffic from the private subnets (or the whole VPC), and allow all Outbound traffic. - Durware
More on Option 2 can be found here: docs.aws.amazon.com/AWSSimpleQueueService/latest/… - Regolith
Thank you. Following your tips I found this really helpful tutorial on using AWS Lambda with SQS and RDS: lisenet.com/2016/… - Murphey
M
9

If you're using the boto3 Python library in a Lambda function inside a VPC, and it's failing to connect to an SQS queue through a VPC endpoint, you must set the endpoint_url when creating the SQS client. boto3 issue 1900 (github.com/boto/boto3/issues/1900) describes the background behind this.

The solution looks like this (for an SQS VPC endpoint in us-east-1):

sqs_client = boto3.client('sqs',
    endpoint_url='https://sqs.us-east-1.amazonaws.com')

Then call send_message or send_message_batch as normal.
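
To avoid hard-coding the endpoint, one option is to derive it from the queue URL you already have. The following is a minimal sketch under that assumption: `regional_endpoint` and `make_sqs_client` are hypothetical helper names, and the timeout and retry values are arbitrary choices meant to make a connectivity problem surface quickly rather than run into the Lambda timeout.

```python
from urllib.parse import urlparse


def regional_endpoint(queue_url):
    """Return the scheme and host of a queue URL.

    For 'https://sqs.us-east-1.amazonaws.com/<account>/<queue>' this gives
    'https://sqs.us-east-1.amazonaws.com'.
    """
    parts = urlparse(queue_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("not an https queue URL: %r" % queue_url)
    return "%s://%s" % (parts.scheme, parts.netloc)


def make_sqs_client(queue_url, region_name):
    """Create an SQS client pinned to the queue's regional endpoint.

    Needs boto3 installed, so it is imported lazily.
    """
    import boto3
    from botocore.config import Config

    # Short timeouts and few retries: fail fast if the endpoint is unreachable.
    config = Config(connect_timeout=3, read_timeout=3, retries={"max_attempts": 2})
    return boto3.client(
        "sqs",
        region_name=region_name,
        endpoint_url=regional_endpoint(queue_url),
        config=config,
    )
```

Inside the handler you would then call something like `make_sqs_client(queue_url, 'us-east-1').send_message(QueueUrl=queue_url, MessageBody='Testing')`.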

Mathison answered 29/10, 2021 at 15:44 Comment(1)
I spent a day on this and asked this question before finding your answer. Thanks! - Myra
S
1

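You need to place your Lambda inside your VPC and then set up either a VPC endpoint for SQS or a NAT gateway. When you add your Lambda function to subnets, make sure you add it ONLY to private subnets: a Lambda function in a public subnet is not assigned a public IP address, so it cannot reach the Internet through the Internet gateway.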

Reference

https://docs.aws.amazon.com/lambda/latest/dg/vpc.html

https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/

Stereometry answered 24/7, 2020 at 13:2 Comment(0)
J
1

I am pretty convinced that you cannot call an SQS queue from a Lambda function inside a VPC through an SQS endpoint. I'd consider it a bug, but maybe the Lambda team did this for a reason. In any case, you will get a timeout. I cooked up a simple test Lambda:

import json
import boto3
import socket

def lambda_handler(event, context):
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
     
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }

I created a VPC, subnet, etc. following the AWS guide "Configuring a Lambda function to access resources in a VPC". The EC2 instance in this example has no problem invoking SQS through the private endpoint from the CLI per this tutorial.

If I drop my simple Lambda above into the same VPC and subnet, with SQS publishing permissions etc., and invoke the test function, it properly resolves the IP address of the SQS endpoint within the subnet, but the call times out (make sure your Lambda timeout is more than 60 seconds so that boto has time to fail). Enabling boto debug logging further confirms that the IP is resolved correctly and that the HTTP request to SQS times out.
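
A quicker way to separate a DNS problem from a routing problem, without waiting for boto to give up, is to attempt a plain TCP connection with a short timeout. This is only a sketch using the standard library: `probe_tcp` is a made-up helper name, and the host in the usage line is the one from the test Lambda above.

```python
import socket


def probe_tcp(host, port=443, timeout=2.0):
    """Resolve `host`, then try a TCP connection to it with a short timeout."""
    result = {"host": host, "address": None, "connected": False, "error": None}
    try:
        result["address"] = socket.gethostbyname(host)
    except socket.gaierror as exc:
        # Name resolution itself failed: a DNS problem, not a routing one.
        result["error"] = "DNS lookup failed: %s" % exc
        return result
    try:
        with socket.create_connection((result["address"], port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:
        # Covers timeouts and refused connections: name resolved, path blocked.
        result["error"] = "TCP connect failed: %s" % exc
    return result
```

Calling `probe_tcp('sqs.us-west-2.amazonaws.com')` from the handler and printing the result shows whether the name resolved and whether port 443 was reachable. A resolved address with a failed connection points at routing or security group rules rather than DNS.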

I didn't try this with a non-FIFO queue but as the HTTP call is failing on connection request this shouldn't matter. It's got to be a routing issue from the Lambda as the EC2 in the same subnet works.

I modified my simple Lambda to add an SNS call and ran the same test, which worked. The issue appears to be specific to SQS, as best I can tell.

import json
import boto3
import socket

def testSqs():
    print('lambda-test SQS...')
    sqsDomain='sqs.us-west-2.amazonaws.com'
    
    addr1 = socket.gethostbyname(sqsDomain)
    print('%s=%s' %(sqsDomain, addr1))
    
    print('Creating sqs client...')
    sqs = boto3.client('sqs')
    
    print('Sending Test Message...')
    response = sqs.send_message(
            QueueUrl='https://sqs.us-west-2.amazonaws.com/1234567890/testq.fifo',
            MessageBody='Test SQS Lambda!',
            MessageGroupId='test')
            
    print('SQS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def testSns():
    print('lambda-test SNS...')

    print('Creating sns client...')
    sns = boto3.client('sns')
    
    print('Sending Test Message...')
    response = sns.publish(
            TopicArn='arn:aws:sns:us-west-2:1234567890:lambda-test',
            Message='Test SQS Lambda!'
            )
            
    print('SNS send response: %s' % response)

    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
    

def lambda_handler(event, context):
    #return testSqs()
    return testSns()

I think your only options are to use NAT (per John above), bounce your calls off a local EC2 instance (NAT will be simpler, cheaper, and more reliable), or use a Lambda proxy outside the VPC, which someone else suggested in a similar post. You could also subscribe an SQS queue to an SNS topic (I prototyped this and it works) and route it out that way too, but that just seems silly unless you absolutely have to have SQS for some obscure reason.

I switched to SNS. I was just hoping to get some more experience with SQS. Hopefully somebody can prove me wrong, but I call it a bug.

Jemena answered 7/4, 2021 at 13:46 Comment(1)
There is a bug - but it's not in AWS, it's actually in the boto3 library! There is an open issue at github.com/boto/boto3/issues/1900. In order to connect to the queue through the VPC endpoint, you have to tell boto3 to use it when you instantiate the SQS client - see my answer on this page. - Mathison

© 2022 - 2024 — McMap. All rights reserved.