Kubernetes deployment strategy using CQRS with dotnet & MongoDb
I am re-designing a dotnet backend API using the CQRS approach. This question is about how to handle the Query side in the context of a Kubernetes deployment.

I am thinking of using MongoDb as the Query Database. The app is a dotnet webapi app. So what would be the best approach:

  1. Run MongoDb as a sidecar container in the same Pod as the dotnet app, so each Pod carries the app AND its own MongoDb. Scale as needed.

  2. Containerize MongoDb in its own pod and deploy one MongoDb pod PER REGION. Then have the dotnet containers use the MongoDb pod within their own region (see the configuration sketch after this list). Scale MongoDb by region, and the dotnet pods as needed within and between regions.

  3. Some other approach I haven't thought of
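
For concreteness, here is roughly how I imagine option 2 looking from the app's side: each regional Deployment injects the address of its region-local MongoDb Service, and the dotnet webapi just reads it from configuration. This is only a sketch; the env var name, service DNS name, and the Product type are made up for illustration.

```csharp
// Sketch of option 2 wiring: a per-region Deployment injects the address of
// its region-local MongoDb Service. Env var and DNS names are illustrative.
using MongoDB.Driver;

var builder = WebApplication.CreateBuilder(args);

// e.g. "mongodb://mongodb.query-side.svc.cluster.local:27017" inside the
// region's cluster; falls back to localhost for local development.
var mongoUrl = Environment.GetEnvironmentVariable("MONGO_URL")
               ?? "mongodb://localhost:27017";

builder.Services.AddSingleton<IMongoClient>(new MongoClient(mongoUrl));
builder.Services.AddSingleton(sp =>
    sp.GetRequiredService<IMongoClient>().GetDatabase("readmodel"));

var app = builder.Build();

// A read-only query endpoint served entirely from the region-local read db.
app.MapGet("/products/{id}", async (string id, IMongoDatabase db) =>
{
    var product = await db.GetCollection<Product>("products")
                          .Find(p => p.Id == id)
                          .FirstOrDefaultAsync();
    return product is null ? Results.NotFound() : Results.Ok(product);
});

app.Run();

// Hypothetical read-model document.
public record Product(string Id, string Name, decimal Price);
```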

Southwick answered 17/6, 2021 at 18:41 Comment(2)
What is the write-side like? – Groundwork
Write side is a postgresql db on a dedicated VM using EF core – Southwick

I would start with the simplest approach, which is to place the write and read side together, because they belong to the same bounded context.

Then, if it is needed in the future, I would consider adding more read sides or scaling out to other regions.

To get started I would also consider putting the read side on the same VM as the write side, just to keep it simple; getting it all up and working in production is always a big task with a lot of pitfalls.

I would consider using a Kafka-like system to transport the data to the read-sides. With queues, if you later add a new read-side, or if you want to rebuild a read-side instance, things can get troublesome: the sender needs to know what read-sides exist. With a Kafka style of integration, each "read-side" can consume the events at its own pace, you can more easily add read-sides later on, and the sender does not need to be aware of the receivers.

Kafka allows you to decouple the producers of data from the consumers of the data, as in this picture taken from one of my training classes:

[Image: producers and consumers decoupled by the Kafka log]

In Kafka you have a set of producers appending data to the Kafka log:

[Image: producers appending records to the log]

Then you can have one or more consumers processing this log of events:

[Image: consumers reading the log at independent offsets]
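
To make this concrete, here is a minimal sketch of what such a read-side consumer could look like with the Confluent.Kafka client for dotnet. The broker address, topic name, and the UpdateReadModel projection are assumptions for the example, not prescriptions:

```csharp
// Minimal read-side projector: consumes domain events from a Kafka topic at
// its own pace. Broker, topic, and UpdateReadModel are illustrative only.
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "kafka:9092",            // assumed broker address
    GroupId = "product-read-side",              // each read-side = its own group
    AutoOffsetReset = AutoOffsetReset.Earliest, // a new read-side replays from the start
    EnableAutoCommit = false                    // commit only after projecting
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("domain-events");            // assumed topic name

while (true)
{
    var result = consumer.Consume(CancellationToken.None);
    UpdateReadModel(result.Message.Key, result.Message.Value); // hypothetical projection
    consumer.Commit(result);                    // advance this consumer's offset only
}

static void UpdateReadModel(string key, string json)
{
    // Here you would upsert the denormalized document in the read db.
    Console.WriteLine($"Projected event {key}: {json}");
}
```

Because each consumer group tracks its own offset in the log, a newly added read-side simply starts from the beginning and catches up, without the producer changing at all.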

Groundwork answered 21/6, 2021 at 7:55 Comment(10)
Yes, good point, I may be making things more "fancy" than they need to be at this point. Though I would still like to hear how my scenario might be done... with an eye toward an eventually fancier Kubernetes deployment. I want to be able to serve users as fast as possible in different zones around the world, hence putting the read-side closer to where the app is being consumed. – Southwick
Do you use event-sourcing? Or how do you intend to replicate the data from the write to the read side? Will you only have one write instance, or multiple write-instances as well? The issue you have is eventual consistency: how much "lag" will your users tolerate? – Groundwork
Btw, if you want to see some real CQRS/ES code, I have a bunch of that on cqrs.nu – Groundwork
Also, if you are on postgres I would consider martendb.io – Groundwork
Regarding event sourcing, I am relying on Azure Service Bus to dispatch Domain Events. Definitely only ONE write instance -- just a VM dedicated to postgres, not within a Kubernetes cluster. I am able to tolerate a little latency on the read-side. Thanks for the link to the cqrs code, will give it a look! Will also check out martendb! – Southwick
Don't know Kafka, but will look into it. Maybe a dumb question here: why use Kafka vs Azure Service Bus? – Southwick
An illustrated intro to Kafka can be found at gentlydownthe.stream. I will add a picture about Kafka to the answer. – Groundwork
Using Kafka, you can add read-sides in the future and they can just start processing the events from the start, and the producer (the write side) does not need to know about the consumers. Most cloud providers do provide one or more Kafka-like services. The main difference with a service bus is that in the bus, processed messages are generally removed, while in Kafka you keep the data in the "log". – Groundwork
Let us continue this discussion in chat. – Southwick
Any further questions for this bounty? – Groundwork

It has been almost 2 years since I posted this question. Now, with 20/20 hindsight, I thought I would post my solution. I ended up simply provisioning an Azure Cosmos Db in the region where my cluster lives, and hitting the Cosmos Db for all my query-side requirements.

(My cluster already lives in the Azure Cloud)

I maintain one Postgres Db in my original cluster for my write-side requirements. And my app scales nicely in the cluster.

I have not yet needed to deploy clusters to new regions. When that happens, I will provision a replica of the Cosmos Db to that additional region or regions. But still just one postgres db for write-side requirements. Not going to bother to try to maintain/sync replicas of the postgres db.

Additional insight #1. By provisioning the Cosmos Db separately from my cluster (but in the same region), I am taking the load off of my cluster nodes. In effect, the Cosmos Db has its own dedicated compute resources, as well as its own backup, etc.

Additional insight #2. It is obvious now but wasn't back then, that tightly coupling a document db (such as MongoDb) to a particular pod is... a bonkers bad idea. Imagine horizontally scaling your app: with each new instance of your app you would instantiate a new document db. You would quickly bloat up your nodes and crash your cluster. One read-side document db per cluster is an efficient and easy way to roll.

Additional insight #3. The read side of any CQRS can get a nice jolt of adrenaline with the help of an in-memory cache like Redis. You can first see if some data is available in the cache before you hit the document db. I use this approach for data such as a checkout cart, where I will leave data in the cache for 24 hours and then let it expire. You could conceivably use redis for all your read-side requirements, but memory could quickly become bloated. So the idea here is to consider deploying an in-memory cache on your cluster -- only one instance of the cache -- and have all your apps hit it for low-latency/high-availability, but do not use the cache as a replacement for the document db.
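
To sketch insight #3 in code: a cache-aside read with StackExchange.Redis in front of the document db. The Cart type, key format, and 24-hour TTL are just illustrative of my checkout-cart scenario:

```csharp
// Cache-aside read: try Redis first, fall back to the document db, then
// populate the cache with a 24h expiry. Names and types are illustrative.
using System.Text.Json;
using MongoDB.Driver;
using StackExchange.Redis;

public record Cart(string Id, List<string> Items);

public class CartReader
{
    private readonly IDatabase _cache;
    private readonly IMongoCollection<Cart> _carts;

    public CartReader(IConnectionMultiplexer redis, IMongoDatabase db)
    {
        _cache = redis.GetDatabase();
        _carts = db.GetCollection<Cart>("carts");
    }

    public async Task<Cart?> GetCartAsync(string cartId)
    {
        // 1. Check the in-memory cache first.
        var cached = await _cache.StringGetAsync($"cart:{cartId}");
        if (cached.HasValue)
            return JsonSerializer.Deserialize<Cart>(cached.ToString());

        // 2. Miss: fall back to the read-side document db.
        var cart = await _carts.Find(c => c.Id == cartId).FirstOrDefaultAsync();
        if (cart is not null)
        {
            // 3. Populate the cache, and let the entry expire after 24 hours.
            await _cache.StringSetAsync($"cart:{cartId}",
                JsonSerializer.Serialize(cart), TimeSpan.FromHours(24));
        }
        return cart;
    }
}
```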

Southwick answered 4/2, 2023 at 3:42 Comment(0)
