How do you maintain idempotency with Azure EventGrid webhooks?

I have configured an EventGrid subscription to initiate a web hook call for events in a resource group when a resource is created.

The web hook call is successfully handled, and I return a 200 OK. To maintain idempotency, I store all events that have occurred in a webhook_events table with the id of the event. Any new events are checked to see if they exist in that table by their id.
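For reference, the check described above looks roughly like the sketch below. It assumes SQLite stands in for the real database, and `open_store`, `handle_once` and the `process` callback are hypothetical names used only for illustration. The insert and the processing share one transaction, so a failed run does not leave the id marked as seen.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the dedupe store; the table name mirrors the one in the question."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webhook_events ("
        "id TEXT PRIMARY KEY, "
        "received_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def handle_once(conn, event, process):
    """Run process(event) only the first time event['id'] is seen.

    Returns True if the event was processed, False if it was a duplicate.
    """
    with conn:  # commits on success, rolls back if process() raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO webhook_events (id) VALUES (?)",
            (event["id"],),
        )
        if cur.rowcount == 0:
            # The id was already in the table, so this is a repeat delivery.
            return False
        process(event)
    return True
```

The webhook would call `handle_once` for each event in the request body and return 200 OK either way, since a duplicate is not an error.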

Azure EventGrid attempts to remove the event from the retry queue after returning a 200 OK. No matter how quickly I respond with a 200 OK, EventGrid reliably retries sending.

I am receiving the same event multiple times (as I said, EventGrid always retries, as it cannot remove the event from the retry queue fast enough). This however is not the focus of my question; rather, the issue exists in the fact that each of these retries presents me with a different id for the event. This means that I cannot logically determine the uniqueness of an event, and my application code is not being executed in an idempotent fashion.

How can I maintain idempotency between my application and Azure despite there being no unique identifier between event retries?

Arc answered 19/12, 2018 at 13:6 Comment(2)
This is likely not the EventGrid but the resource provider (service). You might want to raise the issue with the Microsoft group owning that provider. - Coster
I thought that may have been the case. I'll reach out to the team working on Azure SQL Database and see if they have a problem with their implementation. - Arc
L
3

The id field is in fact unique per event and stays identical between retries, so it can be used for deduplication.

What you're running into is a specific issue with some events generated by Azure Resource Manager (ARM). Specifically, the two events you are seeing are in fact distinct events, not duplicates, generated by ARM at different stages of the creation flow for some resource types.

ARM acts as the API front door to the various Azure services and emits a generalized set of events, so to get the details of what has occurred you often need to look in the data payload. For example, ARM will emit a success event for each 2xx status code it receives from an Azure service, so a 202 Accepted and a 201 Created can result in two events being emitted, and the only way to see the difference is in the data payload.
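If the application only cares that a resource was written, not about each ARM stage, one option is to dedupe on a key built from the data payload rather than on id. The sketch below is an illustration under assumptions: it takes the field names `resourceUri`, `operationName` and `correlationId` to be present in `data`, and assumes `correlationId` is shared by the events of one ARM operation. Both should be checked against real payloads. `logical_key` and `collapse` are hypothetical helpers.

```python
def logical_key(event):
    """Build a key that is the same for all ARM events of one operation.

    The field names under `data` are assumptions; verify them against the
    payloads your subscription actually delivers.
    """
    data = event.get("data") or {}
    return (
        event.get("eventType"),
        data.get("resourceUri") or event.get("subject"),
        data.get("operationName"),
        data.get("correlationId"),
    )


def collapse(events):
    """Keep the first event for each logical key, preserving arrival order."""
    seen = set()
    kept = []
    for event in events:
        key = logical_key(event)
        if key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return kept
```

In a real handler the key would be stored in the same table as the event id instead of an in-memory set, so the check survives across requests.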

This is a known pain point, and we are working to emit more high-fidelity events that will be clearer and easier to react to in these scenarios. The ideal state will be a change-feed of sorts for the Azure control plane.

Lepine answered 18/3, 2020 at 2:15 Comment(0)
C
5

This is how EventGrid is implemented. According to the documentation:

If the endpoint responds within 3 minutes, Event Grid will attempt to remove the event from the retry queue on a best effort basis but duplicates may still be received.

You can use back-end code to clean up logs and stored data, using event and message IDs to identify duplicates.

Cupreous answered 8/8, 2019 at 5:16 Comment(2)
Why are you linking to a Polish site? - Crimmer
Sajeetharan, I am not able to find a messageId field in the payload from EventGrid. The id field is different between retries of the same event. I am still trying to find a way to enforce idempotency with EventGrid webhooks. - Arc
