Using Cassandra as an event store

I want to experiment with using Cassandra as an event store in an event sourcing application. My requirements for an event store are quite simple. The event 'schema' would be something like this:

  • id: the id of an aggregate root entity
  • data: the serialized event data (e.g. JSON)
  • timestamp: when the event occurred
  • sequence_number: the unique version of the event

I am completely new to Cassandra so forgive me for my ignorance in what I'm about to write. I only have two queries that I'd ever want to run on this data.

  1. Give me all events for a given aggregate root id
  2. Give me all events for a given aggregate root id where the sequence number is > x

My idea is to create a Cassandra table in CQL like this:

CREATE TABLE events (
  id uuid,
  seq_num int,
  data text,
  timestamp timestamp,
  PRIMARY KEY (id, seq_num)
);

Does this seem like a sensible way to model the problem? And, importantly, does using a compound primary key allow me to efficiently perform the queries I specified? Remember that, given the use case, there could be a large number of events (with a different seq_num) for the same aggregate root id.

My specific concern is that the second query is going to be inefficient in some way (I'm thinking about secondary indexes here...)

Refutation answered 11/10, 2013 at 15:20 Comment(3)
Now that it's a year later, I'm curious to know how your event sourcing project using Cassandra went.Panda
It seems logical that you also want all events in chronological order to rebuild query models. For that, it would seem that Cassandra is rather hard to handle.Neuromuscular
In the end I went with using Akka Persistence and the Cassandra journal plugin, thus delegating the schema decisions to the plugin rather than designing my own schema. Akka Persistence works incredibly well as a means to implement DDD using the actor model. By following a single-persistent-actor-per-aggregate-root approach (a single actor across the whole cluster), it ensures events are written chronologically. I recommend looking up Akka Cluster Sharding for details of ensuring a unique actor per aggregate root across an entire cluster.Refutation

Your design seems to be well modeled in "Cassandra terms". The queries you need are indeed supported by "composite key" tables; you would have something like:

  • query 1: select * from events where id = <aggregate_id>;
  • query 2: select * from events where id = <aggregate_id> and seq_num > <x>;

I do not think the second query is going to be inefficient; however, it may return a lot of rows. If that is a concern, you can cap the number of events returned with the LIMIT keyword.
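
For example (a minimal sketch; <aggregate_id> stands in for a real uuid, and 42 and 500 are placeholder values for the last sequence number seen and the maximum page size):

SELECT * FROM events WHERE id = <aggregate_id> AND seq_num > 42 LIMIT 500;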

Using composite keys seems like a good match for your specific requirements. A "secondary index" does not seem to bring much to the table... unless I am missing something in your design/requirements.

HTH.

Tumor answered 11/10, 2013 at 15:57 Comment(1)
Thanks for your advice. I was only bringing up secondary indexes as I wasn't sure whether they were related to compound keys or not.Refutation

What you've got is good, except in the case of many events for a particular aggregate. One thing you could do is create static columns to hold "next" and "max_sequence": the static columns would hold the current max sequence for the partition and the "artificial id" of the next partition. You could then store, say, 100 or 1000 events per partition. What you've essentially done then is bucket the events for an aggregate into multiple partitions. This means additional overhead for querying and storing, but at the same time it protects against unbounded growth. You might even create a lookup table of the partitions for each aggregate. It really depends on your use case and how "clever" you want it to be.
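
As an illustrative sketch of such a bucketed layout (the column names bucket, max_sequence, and next_bucket are assumptions, not from the original answer):

CREATE TABLE events (
    id uuid,
    bucket int,                -- artificial partition id, e.g. seq_num / 1000
    seq_num int,
    data text,
    max_sequence int static,   -- highest seq_num written to this bucket
    next_bucket int static,    -- "pointer" to the next bucket, if it exists
    PRIMARY KEY ((id, bucket), seq_num)
);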

Aintab answered 5/3, 2015 at 13:12 Comment(0)

I've been using Cassandra for a very similar scenario (with 100k+ columns per row) and ended up with a model close to yours. I also agree with emgsilva that a secondary index probably won't bring much.

There are three things that turned out to be significant for good performance in our event store: using composite columns, making sure the columns are in a nicely sortable order (Cassandra sorts the data within a row by its columns), and using compact storage if possible.

Note that compact storage means you can only have one value column. Hence, you need to make all other columns part of the key.

For you, the schema would be:

CREATE TABLE events (
    id uuid,
    seq_num int,
    timestamp timestamp,
    data text,
    PRIMARY KEY (id, seq_num, timestamp))
    WITH COMPACT STORAGE;
Metronome answered 27/4, 2015 at 9:51 Comment(0)

Your partition key is too granular; you should create a composite partition key, or change the key, to get better performance for time-series modelling. For instance:

CREATE TABLE events (
    event_date int,
    id timeuuid,
    seq_num int,
    data text,
    PRIMARY KEY (event_date, id)
);

This way your id becomes a clustering column just to guarantee event uniqueness, and your partition key (e.g. 20160922) groups all events per day. You could change it to month as well. Avoid using uuid; use timeuuid instead, as it already stores timestamp information.
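
As a minimal illustration (20160922 is the example date value from the answer above), fetching a day's worth of events then touches a single partition:

SELECT * FROM events WHERE event_date = 20160922;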

Mutate answered 22/9, 2016 at 8:10 Comment(2)
Although this is a simple idea, it is dangerous: on any given day, only one node (and the nodes holding its replicas) will be under load. Moreover, if there is a day with lots of events (even a single spike), this will fail.Tanney
There is no failure in this approach; depending on the application context it can achieve good performance or not. The idea can be extended to hours of the day if needed, to avoid potential spikes within a day, and all data is evenly distributed on the ring. There is no single point of failure, the load is controlled by the driver through the gossip protocol, and other Cassandra fine-tuning resources can be used to improve performance, which is out of scope in this thread.Mutate

The design is coherent with how Cassandra stores data. The first part of your primary key, i.e. your id, is used to partition the data onto separate nodes/v-nodes (depending on how your cluster is configured). This makes fetching the data for your first query very easy for Cassandra, as it only has to touch a single partition. The second part of your key is a clustering key, i.e. it specifies how the data is ordered inside that partition, which is what your second query is all about. Remember, as long as your data is designed in such a way that each query on a table only touches a single partition, you are good to go. Also, if you are worried that the second query will return a huge amount of data, you can always opt for the paging that Cassandra inherently provides for range queries.

Louislouisa answered 3/3, 2019 at 11:6 Comment(0)

Ten years late to give my input; however, I'm also building an event store using Cassandra and wanted to share my thoughts.

I believe it might be beneficial to write your primary key as ((id), seq_num) instead of (id, seq_num). In CQL the two forms are actually equivalent (the first component of the key is always the partition key), but the explicit form documents the partitioning intent and guards against the easy mistake of writing ((id, seq_num)), which would change data distribution significantly.

With ((id, seq_num)), both id and seq_num jointly determine the partition. That scatters the events of a single stream across as many partitions as there are sequence numbers, so they can no longer be read back with an efficient single-partition query. On the other hand, ((id), seq_num) clearly separates the partitioning and clustering responsibilities: id becomes the sole partition key, ensuring all events of a single stream are stored together, while seq_num serves as a clustering key, maintaining the order of events within each stream/aggregate partition.

This structure is more scalable and efficient for event sourcing, as it optimizes both storage and retrieval by aligning with the common pattern of reading all events of a specific stream sequentially.
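
For reference, a minimal sketch of the explicit form, using the same columns as the question's schema:

CREATE TABLE events (
    id uuid,
    seq_num int,
    data text,
    timestamp timestamp,
    PRIMARY KEY ((id), seq_num)
);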

Handicraft answered 14/12, 2023 at 15:11 Comment(0)

Generally you have a good strategy; however, there is one caveat I would like to raise for any type of time-series pattern in Cassandra: tombstones. Whether this matters depends on how you run your event store, i.e. whether or not you do roll-ups and delete old data. If you don't do any deletions at all, you can ignore this and move on. However, if you do any deletions, you need to be aware that Cassandra doesn't directly delete an item; instead it adds a tombstone marker to the table that invalidates the entry (i.e. it inserts an invalidation marker rather than performing a deletion). This helps keep replication and scalability working, but if you have many tombstones they will have a significant impact on your performance, and if you keep deleting at the tail of your time-series table you will end up with many of them. Tombstones get consolidated during table compaction, but sometimes a large number of tombstones can accumulate faster than compaction can clear them.

The solution is to delete entire partitions (a.k.a. rows) at a time, rather than deleting individual elements. Cassandra will write a single tombstone for the whole partition, so you can cut the number of tombstones down by several orders of magnitude. It does require you to delete only at this coarser granularity, and to "bucket" the elements of a single time-series into multiple rows.

For instance, you could make your primary key something along the lines of ((id, seq_prefix), seq_postfix), then delete on (id, seq_prefix), as sketched below.
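
A minimal sketch under that scheme (the bucket size of 1000 and the <aggregate_id> placeholder are illustrative, not from the original answer):

CREATE TABLE events (
    id uuid,
    seq_prefix int,      -- e.g. seq_num / 1000: which bucket the event lands in
    seq_postfix int,     -- e.g. seq_num % 1000: position within the bucket
    data text,
    PRIMARY KEY ((id, seq_prefix), seq_postfix)
);

-- expiring one whole bucket produces a single partition tombstone
DELETE FROM events WHERE id = <aggregate_id> AND seq_prefix = 0;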

Bankhead answered 25/5, 2024 at 16:15 Comment(0)

I don't agree with your design of saving the aggregate root in the event store. You should save domain events for flexibility. Let me explain: a domain event is the smallest-grained piece of data marking a change in the application's state. The aggregate root doesn't match the event store; it is for data exchange or a bounded context. When you use domain events you can reconstruct your data, even the aggregate root, with polyglot modeling. You can manage the model according to the needs of your client and your constraints. So you model a graph for the links between domain objects and then use Neo4j; in addition, you model the aggregate and use a document database. I mean that you have the opportunity to change the model and use the most convenient persistence engine; that is the difference between polyglot data and polyglot persistence. In your strategy I see two ways: if you need event sourcing, you model domain events and use Cassandra; if you need the aggregate root data or model and no event sourcing, you use a document database, which can serve both of your queries.

You should eliminate the confusion about domain-driven design.

Deuterogamy answered 5/3, 2015 at 11:21 Comment(1)
A bit late replying to this... I think you haven't really read the original post properly, or any of the replies. I appreciate you suggesting that I eliminate my confusion about DDD, although I think you'll find that you're the one who is confused on this occasion. It's clear that the discussion is about storing domain events, which can then be replayed to reconstruct an aggregate root.Refutation
