I've read this nice post from Jonathan Oliver about handling out of order events.
http://blog.jonathanoliver.com/cqrs-out-of-sequence-messages-and-read-models/
The solution that we use is to dequeue a message and to place it in a “holding table” until all messages with a previous sequence are received. When all previous messages have been received we take all messages out of the holding table and run them in sequence through the appropriate handlers. Once all handlers have been executed successfully, we remove the messages from the holding table and commit the updates to the read models.
This works for us because the domain publishes events and marks them with the appropriate sequence number. Without this, the solution below would be much more difficult—if not impossible.
This solution is using a relational database as a persistence storage mechanism, but we’re not using any of the relational aspects of the storage engine. At the same time, there’s a caveat in all of this. If message 2, 3, and 4 arrive but message 1 never does, we don’t apply any of them. The scenario should only happen if there’s an error processing message 1 or if message 1 somehow gets lost. Fortunately, it’s easy enough to correct any errors in our message handlers and re-run the messages. Or, in the case of a lost message, to re-build the read models from the event store directly.
Got a few questions particularly about how he says we can always ask the event store for missing events.
- Does the write side of CQRS have to expose a service for the read side to "demand" replaying of events? For example if event 1 was not received but but 2, 4, 3 have can we ask the eventstore through a service to republish events back starting from 1?
- Is this service the responsibility of the write side of CQRS?
- How do we re-build the read model using this?