AWS SQS Asynchronous Queuing Pattern (Request/Response)
I'm looking for help with an architectural design decision I'm making with a product.

We've got multiple producers (initiated by API Gateway calls into Lambda) that put messages on a SQS queue (the request queue). There can be multiple simultaneous calls, so there would be multiple Lambda instances running in parallel.

Then we have consumers (let's say twenty EC2 instances) that long-poll the SQS queue for messages to process. Each message takes about 30-45 seconds to process.

I would then ideally like to send the response back to the producer that issued the request - and this is the part I'm struggling with in SQS. In theory I would have a separate response queue that the initial Lambda producers consume, but there doesn't seem to be a way to cherry-pick the specific correlated response; each Lambda function might pick up another function's response. I'm looking for something similar to this design pattern: http://soapatterns.org/design_patterns/asynchronous_queuing

The only option I can see is to create a new SQS response queue for each Lambda API call, passing its ARN in the message so the consumers know where to put the response, but I can't imagine that's very efficient - especially when there are potentially hundreds of messages a minute. Am I missing something obvious?
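To make the correlation problem concrete, here's a minimal sketch (boto3-style, with hypothetical `CorrelationId`/`ReplyTo` attribute names) of how a request could carry its correlation metadata. SQS message attributes make tagging easy; the missing piece is that SQS has no server-side selective receive, so a consumer on a shared response queue cannot ask for "the message whose `CorrelationId` is X":

```python
import json
import uuid


def build_request(body: dict, reply_queue_url: str) -> dict:
    """Build the kwargs for sqs.send_message(), tagging the request with a
    correlation ID and the queue URL the response should be sent to.
    (Attribute names here are illustrative, not an AWS convention.)"""
    return {
        "MessageBody": json.dumps(body),
        "MessageAttributes": {
            "CorrelationId": {"DataType": "String",
                              "StringValue": str(uuid.uuid4())},
            "ReplyTo": {"DataType": "String",
                        "StringValue": reply_queue_url},
        },
    }


def correlation_id(message: dict) -> str:
    """Extract the correlation ID from a message as returned by
    sqs.receive_message() (pass MessageAttributeNames=["All"] there,
    or the attributes are not included)."""
    return message["MessageAttributes"]["CorrelationId"]["StringValue"]
```

The tagging works fine; the problem described above is that `receive_message` on a shared queue hands back *any* message, so every Lambda would have to receive, inspect, and re-release other callers' responses.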

I suppose the only other alternative would be setting up a bigger message broker (e.g. RabbitMQ/ApacheMQ) environment, but I'd like to avoid that if possible.

Thanks!

Unpolitic answered 20/6, 2017 at 14:21 Comment(0)
Yes, you could use RabbitMQ for a more RPC-style queue pattern.

But if you want to stay within AWS, try using something other than SQS for the response.

Instead, you could use S3 for the response. When your producer puts the item into SQS, include an S3 destination for the response in the message. When your consumer completes the task, it puts the response at that S3 location.

Then you can check S3 for the response.

Update

You may be able to accomplish an RPC-like message queue using Redis.

https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis

Then, you can use AWS ElastiCache for your Redis cluster. This would completely replace the use of SQS.
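As a sketch of what the Redis request/reply pattern from that link boils down to (redis-py assumed; the `rpc:*` key names are made up for illustration): the producer pushes a request carrying a private reply key and then blocks on that key with `BLPOP`, so there is no polling at all.

```python
import json
import uuid

REQUEST_LIST = "rpc:requests"  # hypothetical shared request list


def call(r, payload, timeout=60):
    """Producer: push a request carrying its private reply key, then block
    on that key until a worker pushes the result there."""
    reply_key = f"rpc:reply:{uuid.uuid4()}"
    r.rpush(REQUEST_LIST, json.dumps({"reply_to": reply_key,
                                      "payload": payload}))
    item = r.blpop(reply_key, timeout=timeout)  # (key, value) or None
    if item is None:
        raise TimeoutError("no reply in time")
    return json.loads(item[1])


def serve_one(r, handler):
    """Worker: pop one request, process it, push the result to its reply key."""
    _, raw = r.blpop(REQUEST_LIST)
    request = json.loads(raw)
    result = handler(request["payload"])
    r.rpush(request["reply_to"], json.dumps(result))
    r.expire(request["reply_to"], 300)  # let unclaimed replies expire
```

One tradeoff versus SQS: once a worker pops a request there is no visibility timeout, so if the worker crashes mid-job the request is simply lost unless you add your own retry bookkeeping.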

Gravitative answered 20/6, 2017 at 14:59 Comment(4)
Hi Matt, thanks for the response. I'm not sure how S3 would work for this scenario - the producer would have to poll to check whether the S3 response file exists? There's no long polling on S3 that I could find, so even if I checked it every second it would seem rather inefficient, especially with potentially hundreds of requests a minute.Unpolitic
Correct, you would have to poll S3. AWS does not have a way to signal back to your Lambda function that the response is ready. So you have 4 options: (a) use SQS, one queue per request, (b) use SQS, shared result queue, (neither of those options are ideal), (c) poll something for the result (S3, database, etc.), or (d) use a non-AWS service.Gravitative
Thanks again, I appreciate your clarity and expertise on this issue. I haven't used Redis before, but your link (especially the Request + Reply MQ Pattern) makes it sound like a great option, so I'll investigate that and see if it works in our environment. Thanks again.Unpolitic
Too late for the party, but I was thinking that I might find some help in what I want to achieve, @MattLathing
Create a (Temporary) Response Queue For Every Request

Too late for the party, but I was thinking that I might find some help in what I want to achieve, @MattHouser @Zaheer Ally, or give an idea to someone working on a related issue.

I am facing a similar challenge. I have an API that upon request by a client, needs to communicate to multiple external APIs and collect (delayed) results.

Since my PHP API is synchronous, it can only perform these requests sequentially. So I was thinking of using a request queue, where the producer (the API) would send messages. Multiple workers would then consume these messages, each performing one of the external API calls.

To get the results back, the producer would first create a temporary response queue, whose name/identifier would be embedded in the message sent to the workers. Each worker would then publish its results to this temporary queue.

In the meantime, the producer would keep polling the temporary queue until it received the expected number of messages. Finally, it would delete the queue and send the collected results back to the client.
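The flow described above could look roughly like this with boto3 (queue and field names are made up; the SQS client is passed in so the orchestration logic can be exercised without AWS):

```python
import json
import uuid


def gather_responses(sqs, request_queue_url, jobs):
    """Create a temporary response queue, fan the jobs out with its URL
    embedded, collect one response per job, then delete the queue."""
    reply = sqs.create_queue(QueueName=f"replies-{uuid.uuid4().hex}")
    reply_url = reply["QueueUrl"]
    try:
        for job in jobs:
            sqs.send_message(
                QueueUrl=request_queue_url,
                MessageBody=json.dumps({"reply_to": reply_url, "job": job}),
            )
        results = []
        while len(results) < len(jobs):
            resp = sqs.receive_message(QueueUrl=reply_url,
                                       MaxNumberOfMessages=10,
                                       WaitTimeSeconds=20)  # long poll
            for m in resp.get("Messages", []):
                results.append(json.loads(m["Body"]))
                sqs.delete_message(QueueUrl=reply_url,
                                   ReceiptHandle=m["ReceiptHandle"])
        return results
    finally:
        sqs.delete_queue(QueueUrl=reply_url)
```

Each worker simply does its external call and then `sqs.send_message(QueueUrl=msg["reply_to"], MessageBody=...)`. A production version would also need an overall deadline (a worker that never replies would otherwise block the loop forever) and a sweeper for orphaned reply queues, as the comment below about clean-up suggests.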

Lathing answered 27/10, 2017 at 16:1 Comment(3)
Did it work? IMHO this sounds like a good solution; if you are dealing with synchronous tasks, it seems like a reasonable way to handle them. One can also try to execute the task beforehand. In my case, the synchronous task would be the ranking of some information, which is why I'm leaning towards ranking everything with a cronjob, removing the need for an SQS message to trigger that task.Causey
As a proof of concept, it seemed to work perfectly. I did not have the chance to test it in a production environment though, as, for business reasons, we decided not to integrate more than one external API yet. For us, solutions like cronjobs (or any pre-/post-execution method) were out of the question, since we depend on real-time data and even caching needs to be performed with caution. For your application, a cronjob should do the job. You could also consider a serverless solution, such as a Lambda function that executes with the trigger of your choice (e.g. a database INSERT).Lathing
This is the way to go and there is nothing inelegant about creating temporary queues. Just add a mechanism to clean-up and delete them or re-use them.Udell
Another option would be to use Redis' pub/sub mechanism to asynchronously notify your Lambda that the backend work is done. You can use AWS ElastiCache for Redis for an all-AWS-managed solution. Your Lambda function would generate a UUID for each request, use that as the channel name to subscribe to, pass it along in the SQS message, and then the backend workers would publish a notification to that channel when the work is done.
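A minimal sketch of that flow, assuming redis-py (the `done:` channel prefix is made up). Note the producer subscribes *before* enqueuing the job; Redis pub/sub is fire-and-forget, so a publish to a channel with no subscriber is simply dropped:

```python
import json
import uuid


def wait_for_result(r, sqs, queue_url, payload, timeout=60):
    """Producer (Lambda) side: subscribe to a per-request channel, enqueue
    the job with the channel name, then block until the worker publishes."""
    channel = f"done:{uuid.uuid4()}"
    pubsub = r.pubsub(ignore_subscribe_messages=True)
    pubsub.subscribe(channel)
    sqs.send_message(QueueUrl=queue_url,
                     MessageBody=json.dumps({"notify": channel,
                                             "payload": payload}))
    msg = pubsub.get_message(timeout=timeout)  # waits up to `timeout` seconds
    pubsub.close()
    if msg is None:
        raise TimeoutError("worker did not publish in time")
    return json.loads(msg["data"])


def notify_done(r, request, result):
    """Worker side: publish the result on the request's channel."""
    r.publish(request["notify"], json.dumps(result))
```

Because the notification can be lost (e.g. if the Lambda times out between subscribe and publish), it may be prudent to have the worker also persist the result somewhere durable (S3, a Redis key) and treat the pub/sub message purely as a wake-up signal.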

I was facing this same problem so I tried it out, and it does work. Whether it's worth the effort over just polling S3 is another question. You have to configure the Lambda functions to run inside your VPC so they can access your Redis. I was going to have to do this anyway, since I'd want the workers (in my case also Lambda functions) to be able to access my Elasticsearch and RDS. But there are some considerations: most importantly, you need to use a private subnet with a NAT Gateway (or your own NAT instance) so the Lambdas can reach the Internet and AWS managed services (including SQS).

[Figure: AWS architecture diagram]

One other thing I just stumbled across is that requests through API Gateway currently cannot take longer than 29 seconds, and this cannot be increased by AWS. You mentioned your jobs take 30 or more seconds, so this could be a showstopper for you using API Gateway and Lambda in this way anyway.

Schmitt answered 4/1, 2018 at 23:54 Comment(0)
C
1

AWS now provides a Java client that supports temporary queues, which is useful for request/response patterns. I can't find a non-Java version.

Cuneiform answered 21/9, 2021 at 7:32 Comment(1)
These are "virtual queues" multiplexed over a single SQS queue. The underlying SQS queue cannot be shared between multiple producers, as one could receive another's response. I don't think this would work for Lambda; it seems like a very specialized solution.Ammonium