Event sourcing: handling event schema changes
M

4

17

As a software project built on an event sourcing architecture evolves, the event schemas (or event types) are among the things most likely to change over time.

One of the benefits of an event sourcing architecture is that it can "replay all events" and rebuild the current state from old events.

What if I need to change the event schema (event types), by adding or removing an attribute, or by changing the semantics of an attribute? The current service implementation won't be able to process old events, because they use the old schema (they do not contain an attribute, or the semantics have changed).

Any ideas for how to handle this situation?

Metastasize answered 1/2, 2018 at 14:10 Comment(0)
T
26

This has been the topic of the research I have been doing for the past 2 years. We found 5 techniques that you can use to handle schema evolution:

  1. Versioned events: never change existing events, always introduce new events.
  2. Weak schema: handle missing attributes or superfluous attributes gracefully.
  3. Upcasters: transform events at runtime, before the application has to process them.
  4. In-place transformation: just change what you need to change in your store.
  5. Copy-transform: copy your whole store into a new store.

This is all summarized in our paper "The Dark Side of Event Sourcing" (https://www.movereem.nl/files/2017SANER-eventsourcing.pdf)
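To make the upcaster technique (3) a bit more concrete, here is a minimal sketch, not taken from the paper. It assumes events are plain dicts carrying a `type`, a `version`, and a `data` payload; the `CustomerRegistered` event, its three versions, and the `UPCASTERS` registry are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, Tuple


def customer_registered_v1_to_v2(data: dict) -> dict:
    # Hypothetical change: v2 split the single "name" field into two fields.
    first, _, last = data["name"].partition(" ")
    return {"first_name": first, "last_name": last}


def customer_registered_v2_to_v3(data: dict) -> dict:
    # Hypothetical change: v3 added "email", which old events never recorded.
    return {**data, "email": None}


# Registry keyed by (event type, version the upcaster reads).
UPCASTERS: Dict[Tuple[str, int], Callable[[dict], dict]] = {
    ("CustomerRegistered", 1): customer_registered_v1_to_v2,
    ("CustomerRegistered", 2): customer_registered_v2_to_v3,
}


def upcast(event: dict) -> dict:
    """Apply upcasters one version at a time until none is registered."""
    event_type, version, data = event["type"], event["version"], event["data"]
    while (event_type, version) in UPCASTERS:
        data = UPCASTERS[(event_type, version)](data)
        version += 1
    return {"type": event_type, "version": version, "data": data}


old = {"type": "CustomerRegistered", "version": 1, "data": {"name": "Ada Lovelace"}}
current = upcast(old)
```

The stored event is left untouched; only the copy handed to the application is brought up to the latest version, so the handlers need to understand just one shape per event type.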

We have also researched surrounding areas:

  • Pruning of your event store to keep the size maintainable.
  • How to keep your read models in sync.
  • How DDD can help you to prevent schema evolution.

The last part is not yet public. But you can find slides of a talk I gave yesterday at DDDEurope here: https://speakerdeck.com/overeemm/dddeurope-2018-event-sourcing-after-launch

Tanh answered 2/2, 2018 at 7:54 Comment(6)
What do you think about using schema.org? – Metastasize
Never seen it before. It looks interesting, but personally, I would only use it as inspiration. I would not want to align my product with a schema that is probably more generic than it needs to be. – Tanh
You can extend and modify types. It's already a common resource for writing hypermedia REST services, used to share vocabulary between clients and services... – Metastasize
Sounds interesting, but the link is broken. – Desired
@Desired I fixed the link. – Tanh
Found the talk here: youtube.com/watch?v=JzWJI8kW2kc – Bobbyebobbysocks
H
8

Any ideas about how to handle this situation?

You design for it. You make backwards compatibility a first-class concern when figuring out your event schema, and you get that right early, so that later changes are easy.

See Versioning in an Event Sourced System, by Greg Young.

The basic idea: you never mutate the semantics of a schema element. You can extend a schema by adding a new optional element, and you can deprecate optional elements.
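As a rough illustration of extending a schema with an optional element, here is a small sketch. The `OrderPlaced` event, its fields, and the `read_order_placed` helper are hypothetical; the only point is that the reader supplies a default for the optional element and ignores elements it does not know.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    amount_cents: int
    # Added in a later revision: optional, with a default, so old events still load.
    coupon_code: Optional[str] = None


def read_order_placed(payload: dict) -> OrderPlaced:
    """Build the event from a stored payload, ignoring unknown elements."""
    known = {f.name for f in fields(OrderPlaced)}
    return OrderPlaced(**{k: v for k, v in payload.items() if k in known})


# An event written before coupon_code existed.
old_event = read_order_placed({"order_id": "o-1", "amount_cents": 1250})

# An event written by a newer producer that already sends an extra element.
newer_event = read_order_placed(
    {"order_id": "o-2", "amount_cents": 990, "coupon_code": "SPRING", "gift_wrap": True}
)
```

The meaning of `order_id` and `amount_cents` never changes here; the schema only grows by an element that readers can do without.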

When that doesn't suffice: you create a new schema with better design, and you migrate your data to the new schema.
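A migration of that kind could look roughly like the sketch below, assuming the old store can be read as an iterable of dict events and the new store is simply a list. The `PriceChanged`, `PriceChangedV2`, and `DebugPing` event types are made up for the example.

```python
from typing import Callable, Iterable, List


def migrate_store(
    old_events: Iterable[dict],
    transform: Callable[[dict], List[dict]],
) -> List[dict]:
    """Copy every event into a new store, rewriting it on the way."""
    new_store: List[dict] = []
    for event in old_events:
        # A transform may return zero, one, or several new events.
        new_store.extend(transform(event))
    return new_store


def to_new_schema(event: dict) -> List[dict]:
    if event["type"] == "PriceChanged":
        # Hypothetical redesign: integer cents under a new schema identifier.
        return [{"type": "PriceChangedV2", "price_cents": round(event["price"] * 100)}]
    if event["type"] == "DebugPing":
        return []  # Hypothetical event that the new schema drops entirely.
    return [event]  # Everything else is copied unchanged.


old_store = [
    {"type": "PriceChanged", "price": 12.5},
    {"type": "DebugPing"},
    {"type": "ItemRenamed", "name": "Widget"},
]
new_store = migrate_store(old_store, to_new_schema)
```

The old store is only read, never modified, so it can be kept around until the new one has been verified.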

What do you think about using schema.org?

I think the schema identifiers there are an excellent starting point, and they really open up the possibility of sharing with domain agnostic components some of the details of your messages. For instance, http://schema.org/telephone is a great way to communicate to a generic presentation engine that the enclosed data is suitable for dialing.

So by all means, design your schema with those types in mind, and stay aligned with them for as long as you can.

But when you do diverge, give your schema a new identifier.

Heterochromatic answered 1/2, 2018 at 14:42 Comment(1)
What do you think about using schema.org? – Metastasize
P
2

What if I need to change the event schema (event type) by adding or removing an attribute, or by changing the semantics of an attribute?

Use Avro! It has a well-defined schema evolution process that covers when fields can be added, modified, or removed. You can think of it as a more compact, schema-based alternative to JSON, and it has libraries for many major programming languages.

You can pair this with the Confluent Platform's Avro Schema Registry, which gives you a source of truth and validation for your data schemas. Plus, you can use the Kafka Avro SerDes to manage Kafka message schemas within your topics.
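To give a feel for the field rules, here is a plain-Python sketch of the idea behind Avro's schema resolution. It does not use the Avro library or its API; `resolve`, `_NO_DEFAULT`, and the `reader_fields` mapping are names invented for this illustration. Fields are matched by name, a field the reader does not know is ignored, and a field the writer did not send is filled from the reader's default or rejected if there is none.

```python
_NO_DEFAULT = object()  # Marks a reader field that declares no default.


def resolve(written: dict, reader_fields: dict) -> dict:
    """Read a record written with an older or newer schema.

    reader_fields maps each field name the reader expects to its default,
    or to _NO_DEFAULT when the field is required.
    """
    result = {}
    for name, default in reader_fields.items():
        if name in written:
            result[name] = written[name]
        elif default is not _NO_DEFAULT:
            result[name] = default  # Field added later: fall back to the default.
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return result  # Fields only the writer knew about are dropped.


reader_fields = {"user_id": _NO_DEFAULT, "plan": "free"}

from_old_writer = resolve({"user_id": 7}, reader_fields)
from_new_writer = resolve({"user_id": 8, "plan": "pro", "referrer": "ad"}, reader_fields)
```

This is why adding a field with a default, or removing a field that had one, is generally a safe change, while adding a required field is not.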

Particolored answered 3/2, 2018 at 19:0 Comment(0)
G
0

Any ideas for how to handle this situation?

The short answer is to employ event versioning and a defined migration process. Keep in mind that these are two different, although related, problems.

The longer answer is in the following article.

Hope it helps, cheers

Gladi answered 20/6, 2021 at 9:41 Comment(0)
