SQS Timeout from Lambda in VPC Using VPC Endpoint
This is basically the same issue as in this question, but the answers there didn't get me to a solution.

My configuration is: 1 VPC, 1 subnet, 1 security group. My Lambda runs in the VPC/subnet/security group and tries to add a message to an SQS queue, but gets a timeout. I've double-checked the permissions granted to the Lambda, the policy on the VPC Endpoint, and the policy on the SQS queue. I've also opened up the rules on the Security Group and ensured the Network ACLs are open.

I successfully went through this tutorial, which sets up VPC/etc+EC2 with cloudformation, then demonstrates sending a message to SQS from EC2.

To reproduce my problem, I started with the cloudformation from that tutorial and added the following to it:

  • the VPC Endpoint (rather than creating it through console like in the tutorial)
  • a Lambda (plus IAM role+policy) in the same VPC that tries to send a message to the SQS queue

The resulting cloudformation template is below.

I can reproduce the problem like this:

  1. Create a stack from the cloudformation template (see below). Note that I had to make one small change to the template from the tutorial to get it to work in us-west-2.
  2. SSH to the EC2 and run the command to send an SQS message (see step 5 from the tutorial). This succeeds.
  3. In the console, go to the Lambda, paste the URL of the SQS queue into the code, deploy, and run the lambda. It times out.
  4. In the console, edit the Lambda configuration to set VPC=None, then rerun the lambda. It succeeds.

So the SQS queue is accessible by the lambda outside the VPC, and by EC2 inside the VPC/subnet/sg, but not by the lambda inside the same VPC/subnet/sg.
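
One way to narrow down a timeout like this is to check, from inside the Lambda, whether the SQS hostname resolves to a private address in the VPC (meaning the interface endpoint's private DNS is being used) or to a public address (which a Lambda in a VPC without a NAT cannot reach). The following is only a minimal diagnostic sketch: the helper names and the `hostname` event key are hypothetical and not part of the tutorial or the template below.

```python
import ipaddress
import socket


def is_private_ip(address: str) -> bool:
    """Return True if the address lies in a private range (e.g. 10.0.0.0/16)."""
    return ipaddress.ip_address(address).is_private


def resolve_all(hostname: str) -> list[str]:
    """Return the distinct IP addresses that the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def lambda_handler(event, context):
    # Hypothetical event key: pass the host part of the queue URL here.
    hostname = event["hostname"]
    addresses = resolve_all(hostname)
    return {
        "hostname": hostname,
        "addresses": addresses,
        # True suggests traffic would go through the VPC endpoint.
        "all_private": all(is_private_ip(a) for a in addresses),
    }
```

Invoking this with the host from the queue URL shows whether the request would ever reach the VPC Endpoint; it does not send anything to SQS.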

Any idea what could be missing?

Cloudformation (from tutorial + my additions):

# Copied from this tutorial: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-sending-messages-from-vpc.html
AWSTemplateFormatVersion: 2010-09-09
Description: CloudFormation Template for SQS VPC Endpoints Tutorial
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: 'AWS::EC2::KeyPair::KeyName'
    ConstraintDescription: must be the name of an existing EC2 KeyPair.
  SSHLocation:
    Description: The IP address range that can be used to SSH to the EC2 instance
    Type: String
    MinLength: '9'
    MaxLength: '18'
    Default: 0.0.0.0/0
    AllowedPattern: '(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})'
    ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x.
Conditions:
  IsT3Supported: !Equals [!Ref 'AWS::Region', eu-north-1]
Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-428aa838
    us-east-2:
      AMI: ami-710e2414
    us-west-1:
      AMI: ami-4a787a2a
    us-west-2:
      AMI: ami-7f43f307
    ap-northeast-1:
      AMI: ami-c2680fa4
    ap-northeast-2:
      AMI: ami-3e04a450
    ap-southeast-1:
      AMI: ami-4f89f533
    ap-southeast-2:
      AMI: ami-38708c5a
    ap-south-1:
      AMI: ami-3b2f7954
    ca-central-1:
      AMI: ami-7549cc11
    eu-central-1:
      AMI: ami-1b2bb774
    eu-west-1:
      AMI: ami-db1688a2
    eu-west-2:
      AMI: ami-6d263d09
    eu-north-1:
      AMI: ami-87fe70f9
    eu-west-3:
      AMI: ami-5ce55321
    sa-east-1:
      AMI: ami-f1337e9d
Resources:
  VPC:
    Type: 'AWS::EC2::VPC'
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsSupport: 'true'
      EnableDnsHostnames: 'true'
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-VPC
  Subnet:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref VPC
      # I had to add (uncomment) this line to avoid using us-west-2d, which doesn't support the instance type
      # AvailabilityZone: us-west-2a
      CidrBlock: 10.0.0.0/24
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-Subnet
  InternetGateway:
    Type: 'AWS::EC2::InternetGateway'
    Properties:
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-InternetGateway
  VPCGatewayAttachment:
    Type: 'AWS::EC2::VPCGatewayAttachment'
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway
  RouteTable:
    Type: 'AWS::EC2::RouteTable'
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-RouteTable
  SubnetRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref Subnet
  InternetGatewayRoute:
    Type: 'AWS::EC2::Route'
    Properties:
      RouteTableId: !Ref RouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: 0.0.0.0/0
  SecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupName: SQS VPCE Tutorial Security Group
      GroupDescription: Security group for SQS VPC endpoint tutorial
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: '-1'
          CidrIp: 10.0.0.0/16
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: !Ref SSHLocation
      SecurityGroupEgress:
        - IpProtocol: '-1'
          CidrIp: 10.0.0.0/16
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-SecurityGroup
  EC2Instance:
    Type: 'AWS::EC2::Instance'
    Properties:
      KeyName: !Ref KeyName
      InstanceType: !If [IsT3Supported, t3.micro, t2.micro]
      ImageId: !FindInMap
        - RegionMap
        - !Ref 'AWS::Region'
        - AMI
      NetworkInterfaces:
        - AssociatePublicIpAddress: 'true'
          DeviceIndex: '0'
          GroupSet:
            - !Ref SecurityGroup
          SubnetId: !Ref Subnet
      IamInstanceProfile: !Ref EC2InstanceProfile
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-EC2Instance
  EC2InstanceProfile:
    Type: 'AWS::IAM::InstanceProfile'
    Properties:
      Roles:
        - !Ref EC2InstanceRole
      InstanceProfileName: !Sub 'EC2InstanceProfile-${AWS::Region}'
  EC2InstanceRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'SQS-VPCE-Tutorial-EC2InstanceRole-${AWS::Region}'
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AmazonSQSFullAccess'
  CFQueue:
    Type: 'AWS::SQS::Queue'
    Properties:
      VisibilityTimeout: 60

  # Stuff I added starting here:
  VPCEndpointForSQS:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      VpcEndpointType: 'Interface'
      PolicyDocument:
        Statement:
        - Action: '*'
          Effect: Allow
          Resource: '*'
          Principal: '*'
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.sqs'
      VpcId: !Ref VPC
      SubnetIds:
      - !Ref Subnet
      PrivateDnsEnabled: true
      SecurityGroupIds:
      - !Ref SecurityGroup
  LambdaRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'SQS-VPCE-Tutorial-LambdaRole-${AWS::Region}'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AmazonSQSFullAccess'
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole'
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - 'sts:AssumeRole'
  LambdaPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      PolicyName: !Sub 'SQS-VPCE-Tutorial-LambdaPolicy-${AWS::Region}'
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Action:
          - 'logs:CreateLogGroup'
          Resource: '*'
        - Effect: Allow
          Action:
          - logs:CreateLogStream
          - logs:PutLogEvents
          Resource: '*'
      Roles:
      - !Ref LambdaRole
  LambdaFunction:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: 'SQS-VPCE-Tutorial-Lambda'
      Role: !GetAtt LambdaRole.Arn
      Runtime: 'python3.9'
      Handler: 'index.lambda_handler'
      Timeout: 20
      VpcConfig:
        SecurityGroupIds:
        - !Ref SecurityGroup
        SubnetIds:
        - !Ref Subnet
      Code:
        ZipFile: |
          import json
          import boto3
          from botocore.exceptions import ClientError

          sqs = boto3.resource('sqs')
          queue = sqs.Queue('<INSERT SQS QUEUE URL HERE>')

          def lambda_handler(event, context):
              print("before")
              queue.send_message(MessageBody='Hello from Amazon SQS.')
              print("after")
Arrowood answered 18/1, 2022 at 0:54 Comment(0)

Of course, as soon as I posted this, I found this answer, which says there is a bug in boto3 that prevents it from using the VPC Endpoint for SQS by default. I tried the solution there and it solved the problem!
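
The linked answer is not reproduced here, so the following is an assumption about what the fix looks like rather than a quote from it. The commonly reported workaround is to pass an explicit `endpoint_url` when creating the boto3 SQS resource or client, because older boto3/botocore versions default to a legacy hostname of the form `<region>.queue.amazonaws.com`, while the interface endpoint's private DNS covers `sqs.<region>.amazonaws.com`. A minimal sketch, where `sqs_endpoint_url` and `make_queue` are hypothetical helper names and the hostname pattern is assumed to apply to standard commercial regions such as us-west-2:

```python
import os


def sqs_endpoint_url(region: str) -> str:
    """Build the regional SQS hostname (assumed pattern for standard regions)."""
    return f"https://sqs.{region}.amazonaws.com"


def make_queue(queue_url: str, region: str):
    """Create an SQS Queue resource that targets the regional endpoint explicitly."""
    import boto3  # imported here so the URL helper is usable without boto3

    sqs = boto3.resource(
        "sqs",
        region_name=region,
        endpoint_url=sqs_endpoint_url(region),
    )
    return sqs.Queue(queue_url)


def lambda_handler(event, context):
    # AWS_REGION is set by the Lambda runtime; the queue URL placeholder
    # is the same one used in the template in the question.
    region = os.environ["AWS_REGION"]
    queue = make_queue("<INSERT SQS QUEUE URL HERE>", region)
    print("before")
    queue.send_message(MessageBody="Hello from Amazon SQS.")
    print("after")
```

Newer boto3 releases may already use the regional hostname by default, in which case the explicit `endpoint_url` is harmless but unnecessary.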

Arrowood answered 18/1, 2022 at 2:6 Comment(0)
