AWS: multiple instances reading SQS

Simple question: I want to run an Auto Scaling group on Amazon that launches multiple instances, each of which processes messages from an SQS queue. But how do I know that the instances aren't processing the same messages?

I can delete a message from the queue once it has been processed. But as far as I can tell, while the message is not yet deleted and is still being processed by one instance, another instance can download the same message and process it as well.

Supercharger answered 12/5, 2015 at 10:7 Comment(0)
C
35

Aside from the fairly remote possibility of SQS incorrectly delivering the same message more than once (which you still need to account for, even though it is unlikely), I suspect your question stems from a lack of familiarity with SQS's concept of "visibility timeout."

Immediately after the component receives the message, the message is still in the queue. However, you don't want other components in the system receiving and processing the message again. Therefore, Amazon SQS blocks them with a visibility timeout, which is a period of time during which Amazon SQS prevents other consuming components from receiving and processing that message.

http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/AboutVT.html

This is what keeps multiple queue runners from seeing the same message. Once the visibility timeout expires, the message will be delivered again to a queue consumer, unless you delete it first, it exceeds the maximum receive count of a redrive policy (at which point it moves to the dead letter queue you have configured), or it outlives the queue's message retention period (at which point SQS deletes it). If a job will take longer than the configured visibility timeout, your consumer can also send a request to SQS to change the visibility timeout for that individual message.
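The receive, extend, delete cycle described above can be sketched roughly as follows. This assumes `sqs` is a boto3 SQS client (`boto3.client("sqs")`) and uses its `receive_message`, `change_message_visibility` and `delete_message` calls. The helper name `process_one`, the `handler` callback and the timeout values are hypothetical, chosen only for illustration.

```python
def process_one(sqs, queue_url, handler, visibility_timeout=30, long_job_timeout=None):
    """Receive at most one message, process it, and delete it on success.

    `sqs` is assumed to be a boto3 SQS client; `handler` is a hypothetical
    callback that takes the message body. Returns True if a message was
    processed and deleted, False if the queue returned nothing.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=10,                    # long polling
        VisibilityTimeout=visibility_timeout,  # hidden from other consumers for this long
    )
    messages = resp.get("Messages", [])
    if not messages:
        return False

    msg = messages[0]
    receipt = msg["ReceiptHandle"]

    # For a job known to run long, extend the timeout for this message only.
    if long_job_timeout is not None:
        sqs.change_message_visibility(
            QueueUrl=queue_url,
            ReceiptHandle=receipt,
            VisibilityTimeout=long_job_timeout,
        )

    # If the handler raises, the message is NOT deleted and becomes visible
    # again once the visibility timeout expires.
    handler(msg["Body"])

    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt)
    return True
```

Each instance in the Auto Scaling group would call something like this in a loop. Deleting only after the handler returns means a crashed instance simply lets the message reappear for another consumer.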


Update:

Since this answer was originally written, SQS has introduced FIFO Queues in some of the AWS regions. These operate with the same logic described above, but with guaranteed in-order delivery and additional safeguards to guarantee that occasional duplicate message delivery cannot occur.

FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated. FIFO queues also provide exactly-once processing but are limited to 300 transactions per second (TPS).

http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html

Switching an application to a FIFO queue does require some code changes, and requires that a new queue be created -- existing queues can't be changed over to FIFO.
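As a rough idea of the producer-side change, here is a minimal sketch assuming a boto3 SQS client and its `send_message` call with the FIFO-specific `MessageGroupId` and `MessageDeduplicationId` parameters. The helper name `send_fifo` and the choice of a SHA-256 hash of the body as the deduplication ID are assumptions made for illustration; an application may well have a better natural ID.

```python
import hashlib


def send_fifo(sqs, queue_url, body, group_id):
    """Send one message to a FIFO queue (hypothetical helper).

    `sqs` is assumed to be a boto3 SQS client. Messages sharing a
    `group_id` are delivered in order relative to each other.
    """
    # FIFO queue names carry a ".fifo" suffix, so the queue URL ends with it too.
    if not queue_url.endswith(".fifo"):
        raise ValueError("expected a FIFO queue URL ending in .fifo")

    # Assumption: identical bodies count as the same message. A retry of the
    # same send within the deduplication interval is then not enqueued twice.
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()

    return sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=body,
        MessageGroupId=group_id,
        MessageDeduplicationId=dedup_id,
    )
```

The consumer side stays the same receive and delete loop as with a standard queue.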

Cambist answered 12/5, 2015 at 12:13 Comment(4)
Clear answer! Thank you! So when I download and process one message at a time, there should be no problem :) Otherwise I can change the visibility timeout... Good to know, thanks. - Supercharger
Hi, thanks for the answer. Quick clarification needed: when FIFO queues say exactly-once processing, does it mean that consumers reading from the queue process each message only once, or that messages are published into the queue only once? For example, if there are multiple processes reading messages from the queue, is each message guaranteed to be read by a process only once? - Thorax
@Thorax it is both. SQS FIFO queues ensure that a message is delivered only once, in order, as well as providing mechanisms to ensure that a given message is enqueued only once within a 5 minute interval -- helping avoid circumstances such as a network connection being severed at precisely the wrong instant while the success message is on its way back to the producer, resulting in the message being enqueued but the producer being unsure and resending. In this case, the message is not enqueued again. - Cambist
Thanks Michael, but I realized I cannot configure FIFO queues as an event listener for an S3 bucket. - Thorax
F
3

You can receive duplicate messages, but only "on rare occasions". And so you should aim for idempotency.
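One way to make a consumer idempotent is to remember which message IDs have already been handled and skip repeats. The sketch below is only that, a sketch: the class name `IdempotentHandler` is hypothetical, and the in-memory set only protects a single process. With several instances you would need a shared store (for example a database table with a unique key) instead.

```python
class IdempotentHandler:
    """Wrap a handler so that a redelivered message is processed only once.

    Expects message dicts shaped like the entries in an SQS receive
    response, i.e. with "MessageId" and "Body" keys.
    """

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: per-process memory, not shared across instances

    def __call__(self, message):
        message_id = message["MessageId"]
        if message_id in self._seen:
            return False  # duplicate delivery, nothing to do
        self._handler(message["Body"])
        # Record the ID only after the handler succeeded, so a failed
        # attempt can be retried on redelivery.
        self._seen.add(message_id)
        return True
```

Usage: wrap the real handler once, e.g. `handle = IdempotentHandler(do_work)`, and call `handle(msg)` for every received message before deleting it from the queue.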

Fielder answered 12/5, 2015 at 11:52 Comment(0)
T
0

Another instance can normally receive the same message only once the SQS visibility timeout has expired. By default the visibility timeout is 30 seconds, so you have 30 seconds to finish processing and delete the message; otherwise other instances may receive it again.

See AWS SQS Timeout for timeout details.

Tressietressure answered 23/8, 2022 at 3:36 Comment(0)
