How does Windows Azure Service Bus Queues Duplicate Detection work?

I know that you can set duplicate detection to work over a time period with an azure service bus queue. However, does anyone know whether this works based on the objects in the queue?

So if I have an object with an id of "SO_1" which gets put on the queue and is subsequently consumed, is the duplicate detection still valid?

What I think I'm asking is - is it the timeframe and the object, or just the timeframe that make the queue decide what is a duplicate?

Wreath answered 31/5, 2013 at 11:48 Comment(0)

http://blog.iquestgroup.com/en/windows-azure-service-bus-duplicate-detection/#.UaiXrd7frIU

When we activate duplication, the Windows Azure Service Bus will start to store a history of our messages. This period of time can be configured to range from only a few minutes to days. If a duplicate message is sent to the Service Bus, the service will automatically ignore the message.

Gabion answered 31/5, 2013 at 12:25 Comment(3)
Great stuff, thanks. That last article is helpful, but it still doesn't say whether duplicate detection checks that the item is actually on the queue before deciding to drop it. I'm guessing not: it just keeps a list of message IDs, checks against them, and ignores the contents of the queue. I'll keep this question open until I can find a definitive answer. - Wreath
It is quite clear from the quote: "the Windows Azure Service Bus will start to store a history of our messages. This period of time can be configured to range from only a few minutes to days." - Hachure
Deduplication does not care what is on the queue at the moment. A) If MessageId 1 has passed through during the time frame, a message with the same MessageId will not be enqueued again. B) If MessageId 1 was enqueued before the time frame began and is still sitting on the queue, a second message with MessageId 1 will be enqueued anyway (i.e. duplicated). - Veneering

Posting this to clear up a couple of misconceptions in the responses above:

  1. Enabling duplicate detection keeps track of the application-controlled MessageId of every message sent into a queue or topic during a specified time window. If a new message is sent carrying a MessageId that has already been logged during that window, the message is reported as accepted (the send operation succeeds), but the newly sent message is instantly ignored and dropped. No part of the message other than the MessageId is considered. (The blog referenced in one of the responses says the message content cannot be duplicated, which is not correct.)

  2. The default duplicate detection history window is now 30 seconds; the value can range between 20 seconds and 7 days.

Refer to this documentation for more details.
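
To make the behaviour described above concrete, here is a minimal in-memory sketch. It is a toy model and does not use the Service Bus SDK; the `DedupQueue` class and its method names are hypothetical, and it assumes the window is measured from the moment an ID is first accepted.

```python
from collections import deque
from datetime import datetime, timedelta


class DedupQueue:
    """Toy model of a queue with MessageId-based duplicate detection (hypothetical)."""

    def __init__(self, window: timedelta):
        self.window = window
        self.messages = deque()  # messages currently waiting on the queue
        self.history = {}        # MessageId -> time it was first accepted

    def send(self, message_id: str, body: str, now: datetime) -> bool:
        """The sender always sees success; returns True only if really enqueued."""
        # Forget IDs whose detection window has expired (assumed behaviour).
        self.history = {
            mid: seen for mid, seen in self.history.items()
            if now - seen < self.window
        }
        if message_id in self.history:
            return False  # silently dropped; the body is never inspected
        self.history[message_id] = now
        self.messages.append((message_id, body))
        return True

    def receive(self):
        """Remove and return the oldest message; the ID history is left untouched."""
        return self.messages.popleft() if self.messages else None


t0 = datetime(2024, 1, 1, 12, 0, 0)
q = DedupQueue(window=timedelta(minutes=10))

first = q.send("SO_1", "original", t0)
q.receive()  # consumed: the queue is now empty
resent_inside = q.send("SO_1", "different body", t0 + timedelta(minutes=5))
resent_outside = q.send("SO_1", "original", t0 + timedelta(minutes=11))
```

In this model the resend at five minutes is dropped even though the queue is empty and the body differs, while the resend at eleven minutes is enqueued because the ID has aged out of the history.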

Rickeyricki answered 3/8, 2018 at 13:6 Comment(1)
The default value now seems to be 10 minutes. - Very

This actually just bit me: the default seems to be to have it enabled, and the default time is 10 minutes. The "key" is the MessageId. In most of our scenarios duplicate detection is fine, but in some it was bad news (especially with the 10 minute range). To get around this, we introduced a "breaker":

// For this message, we need to prevent dups from being detected
msg.MessageId = messageId + "_" + DateTime.Now.ToString("u");

If you just want to prevent "spamming" you might consider setting the duplicate detection window to the minimum (20 seconds). (Personally, I would love to see a threshold as low as 5 seconds).

The current ranges allowed are 20 seconds to 7 days.
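
A timestamp suffix only defeats duplicate detection if its resolution is finer than the gap between sends. The sketch below is not the code from this answer: it contrasts a seconds-resolution suffix with a random UUID suffix, which is a swapped-in alternative. The helper names `timestamp_breaker` and `unique_breaker` are made up for illustration.

```python
import uuid
from datetime import datetime


def timestamp_breaker(base_id: str, now: datetime) -> str:
    """Suffix the ID with a timestamp that only resolves to whole seconds."""
    return f"{base_id}_{now.strftime('%Y-%m-%d %H:%M:%SZ')}"


def unique_breaker(base_id: str) -> str:
    """Suffix the ID with a random UUID so each call yields a distinct MessageId."""
    return f"{base_id}_{uuid.uuid4()}"


a = datetime(2024, 1, 1, 12, 0, 0, 100_000)
b = datetime(2024, 1, 1, 12, 0, 0, 900_000)  # same second, 0.8 s later

# Two sends within the same second produce the same ID, so the second is dropped.
same_second_collides = timestamp_breaker("SO_1", a) == timestamp_breaker("SO_1", b)

# A random suffix does not depend on clock resolution.
uuid_ids_differ = unique_breaker("SO_1") != unique_breaker("SO_1")
```

A random suffix gives up duplicate detection entirely for that message, so it only makes sense for messages that are intentionally re-sent.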

Lynnett answered 31/10, 2013 at 4:6 Comment(1)
FWIW: in high-throughput situations the "u" format is not granular enough, because it only resolves to whole seconds. I just fixed a bug caused by this. My fix was to use DateTime.Now.ToString("yyyy-MM-ddTHH:mm:ss.ff") instead, which is close to "u" but adds sub-second precision down to 1/100th of a second. - Lynnett

You will have to create the message ID from the object itself (e.g. a hash of the object) and enable duplicate message detection on the topic/queue.
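
As a rough sketch of deriving an ID from the object, the following hashes a canonical JSON form of the payload. The function name `message_id_for` and the sample payloads are hypothetical, and it assumes the object is JSON-serializable; the resulting string would then be assigned to the message's MessageId.

```python
import hashlib
import json


def message_id_for(payload: dict) -> str:
    """Derive a deterministic MessageId from the payload's content."""
    # Sorted keys and fixed separators give a canonical form,
    # so the order in which keys were added does not change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


order = {"id": "SO_1", "qty": 2}
same_order_reordered = {"qty": 2, "id": "SO_1"}
changed_order = {"id": "SO_1", "qty": 3}

id_a = message_id_for(order)
id_b = message_id_for(same_order_reordered)
id_c = message_id_for(changed_order)
```

Identical content yields the same ID and is therefore treated as a duplicate within the window, while any change to the content yields a new ID.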

Azure Service Bus duplicate detection points to keep in mind:

• A duplicate is identified by MessageId, together with SessionId and PartitionKey when present, within a time window

• Duplicate detection time window:

 o  20 seconds to 7 days (default: 10 minutes)
 o  A larger window can reduce throughput because more recorded IDs must be matched, so keep the window as small as is practical

• Duplicate detection can be enabled only when the topic/queue is created; the window can be updated at any time

• Duplicate messages will be ignored/dropped

ref: https://learn.microsoft.com/en-us/azure/service-bus-messaging/duplicate-detection

Hylton answered 8/12, 2022 at 6:59 Comment(0)
