Importing data and Event Sourcing

I am currently working on a monolithic system which I would like to bring into the modern day and incorporate DDD and CQRS. I have been presented with a request to re-write the importing mechanism for the solution and feel this could present a good opportunity to start this re-architecting process.

Currently the process is:

  1. User uploads CSV
  2. System parses CSV and shows each row on screen. Each row is validated, and any errors/warnings are associated with that row
  3. User can modify each line and re-validate all rows
  4. User then selects rows that don't have errors and submits the import
  5. Selected rows are imported; any non-selected rows, or rows with errors, go into a holding area so the user can deal with them at a later date

An additional detail is that multiple rows could belong to the same entity (e.g. 2 rows could be line items in an order, so would have the same Order Ref).
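To make the current process concrete, here is a minimal Python sketch of the staging step: parse the CSV, validate each row, then split the rows into those to import (grouped by order ref) and those to hold. The column names (`order_ref`, `quantity`), the validation rules and the function names are all assumptions for illustration, not the real system's.

```python
import csv
import io
from collections import defaultdict


def validate_row(row):
    """Return a list of error messages for one row (hypothetical rules)."""
    errors = []
    if not row.get("order_ref"):
        errors.append("order_ref is required")
    try:
        if int(row.get("quantity") or "") <= 0:
            errors.append("quantity must be positive")
    except (ValueError, TypeError):
        errors.append("quantity must be a whole number")
    return errors


def stage_csv(text):
    """Parse CSV text into staged rows, each carrying its own errors."""
    staged = []
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=1):
        staged.append({
            "line": line_no,
            "data": row,
            "errors": validate_row(row),
            "selected": False,  # set by the user on screen
        })
    return staged


def split_for_import(staged):
    """Group selected, error-free rows by order_ref; hold everything else."""
    to_import = defaultdict(list)
    held = []
    for item in staged:
        if item["selected"] and not item["errors"]:
            to_import[item["data"]["order_ref"]].append(item)
        else:
            held.append(item)
    return dict(to_import), held
```

Editing a row on screen would amount to changing `item["data"]` and calling `validate_row` again; the rows returned in `held` are what would go to the holding area.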

I was thinking of having an import saga that would generate a bunch of import aggregates (e.g. OrderImportAggregate), and then when the import is submitted those would get converted into the class used across the system currently, which would hopefully become aggregates in their own right when re-architected further down the line! So the saga process would take on something along the lines of:

  1. [EntityType]FileImportUploaded - Stores the CSV
  2. [EntityType]FileImportParsed - Generates n [EntityType]Import aggregates. [EntityType]ImportItemCreated events raised/handled
  3. Process would call the validation routine that the current entities go through to generate a list of errors, if any, and store against each item. [EntityType]ImportItemValidated events raised/handled
  4. Each time a row is changed on screen, it calls a web api method with the saga and item id to update the details and re-validate the row as per point 3.
  5. User submits import, service groups entities together, based on ref for example, they get converted into the current system entity and calls their import/save routine. [EntityType]ImportItemCompleted event raised.
    1. Saga completes when all aggregates are at ImportItemComplete state
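For illustration, the event flow above could be sketched roughly like this in Python, taking "Order" as the [EntityType]. The `Event` and `ImportItem` classes, the state names and the `saga_is_complete` helper are hypothetical; this only shows the idea of rebuilding an item from its events and checking saga completion, not a full event store.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    item_id: str
    payload: dict = field(default_factory=dict)


class ImportItem:
    """Import item whose state is rebuilt by replaying its events."""

    def __init__(self, events=()):
        self.item_id = None
        self.data = {}
        self.errors = []
        self.state = "New"
        for event in events:
            self.apply(event)

    def apply(self, event):
        if event.name == "OrderImportItemCreated":
            self.item_id = event.item_id
            self.data = dict(event.payload)
            self.state = "Created"
        elif event.name == "OrderImportItemValidated":
            # Re-validation after an edit simply appends another event.
            self.errors = list(event.payload["errors"])
            self.state = "Validated"
        elif event.name == "OrderImportItemCompleted":
            self.state = "Completed"


def saga_is_complete(streams):
    """True when there is at least one item and every item is Completed."""
    return bool(streams) and all(
        ImportItem(events).state == "Completed" for events in streams.values()
    )
```

Here `streams` is assumed to be a dict mapping an item id to that item's list of events.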

As this is my first implementation of CQRS/Event Sourcing/DDD, I want to start off on the right foundation, so I am wondering whether this is a sensible approach for this functionality?

Tic answered 26/4, 2017 at 16:54 Comment(15)
What is this "[EntityType]" that is repeated throughout the question?Upbraiding
There are multiple types that could be imported. Without giving away what I actually work on, the closest thing i can think of is having the ability to import customers and orders from an online store into some back office system. The entity types would be the customer and the order.Tic
OK, what are the invariants that your aggregates are protecting?Upbraiding
They will enforce the invariants of the current classes within the legacy code e.g. customers must have a date of birth above a certain age, a valid name etc. But will also hold the validation errors, if any, and also warnings, such as a warning to say the entity already exists in the system. These don't fit within the domain of the current legacy classes for customers/orders etc.Tic
I'm trying to understand your business and what is the point of this import process and how it relates to those aggregates.Upbraiding
DDD is about the business, and without a deep understanding of the business processes, DDD cannot be applied.Upbraiding
Clients have another system they will import the data from so it can undertake additional processing in our system. I have read other questions about this and the popular opinion is it's not a DDD concern as such and would just be an application service that would orchestrate the parsing, but none of the questions outline the additional requirement of persisting the pre-parsed data for editing before triggering the import, which is where i feel there is a need for an aggregateTic
In general, aggregates are used to protect business invariants. You don't necessarily have to have Aggregates if you don't need them. From what I see you have more of a UI concern and that should be resolved with a more CRUD solution. You could still use other patterns from DDD like the Ubiquitous language (the most important one!), bounded contexts (for example you could have a separate bounded context for importing), value objects etc.Upbraiding
That sounds fair enough, i currently have POCOs for the import objects, but wondered how conversion from that to a domain model would work, as i would need to validate the import object beforehand, but extracting the validation away from the domain model would start making the domain model anaemic, wouldn't it?Tic
It depends on the type of validation. The model may be anemic if the business rules are anemic. If the validation is done by the user visually inspecting some data on screen and then pressing save, then CRUD is better for that bounded context.Upbraiding
It's not all superficial validation (e.g. emails in correct format), there is also validation based on other factors (such as if they are male, they can't have a title of Mrs....But my domain has a bit more complexity than that!).Tic
I understand, but this is a verification made by a human, right?Upbraiding
This import staging is so users can check an import and see errors before submitting. The user can correct errors, yes, but it's only the same as creating/editing a record normally (e.g. going to a customer edit or add page), just in bulk.Tic
Then you don't need a Saga for this. You need a UI componentUpbraiding
Let us continue this discussion in chat.Tic

I suggest that you break your domain into two separate sub-domains implemented as two separate bounded contexts, one being the Import bounded context (ImportBC) and the other being the receiving bounded context (ReceivingBC; the actual name is not known to me, please replace it accordingly).

Then, in the ImportBC, implement things in a CRUD style: have an entity for each import file and use persistence to remember the progress of the validation and import process (this entity holds a list of not-yet-imported items). After each item is validated by a human, a command could be sent to the aggregates in the ReceivingBC to test whether the item is valid according to the business rules, but without committing the changes to the repository! You do this so that the human user knows whether the item is indeed valid, and to enable/disable an import button. In this way you don't duplicate the validation logic inside the two bounded contexts. When the user actually presses the import button, send the import command to the aggregate in the ReceivingBC and this time actually commit the changes to the repository. Also, remove the import item from the import file CRUD entity.

This technique of sending commands without actually persisting to the repository helps the user experience in the UI (without duplicating logic inside the UI), and it is doable if you follow DDD best practices and design your aggregates to be pure, side-effect-free objects (repository agnostic: they should not know that repositories exist, and should not use them at all!).
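A minimal Python sketch of this "validate without committing" idea, under some assumptions: `Customer`, `InMemoryRepository`, `handle_import_customer` and the `dry_run` flag are hypothetical names, and the minimum age of 18 is made up for the example.

```python
class Customer:
    """Hypothetical ReceivingBC aggregate: pure, unaware of repositories."""

    def __init__(self):
        self.name = None
        self.age = None

    def import_details(self, name, age):
        # Business rules live in the aggregate only (assumed rules).
        if not name.strip():
            raise ValueError("name is required")
        if age < 18:
            raise ValueError("customer must be at least 18")
        self.name = name.strip()
        self.age = age


class InMemoryRepository:
    """Stand-in for the real repository."""

    def __init__(self):
        self.saved = []

    def save(self, aggregate):
        self.saved.append(aggregate)


def handle_import_customer(repo, name, age, dry_run=False):
    """Run the command on the aggregate; persist only if not a dry run.

    Returns a list of error messages (empty when the item is valid).
    """
    customer = Customer()
    try:
        customer.import_details(name, age)
    except ValueError as exc:
        return [str(exc)]
    if not dry_run:
        repo.save(customer)
    return []
```

The ImportBC would call the handler with `dry_run=True` after each edit to decide whether to enable the import button, and with `dry_run=False` when the user actually presses it.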

Upbraiding answered 27/4, 2017 at 19:24 Comment(1)
That sounds like a sound approach and i think resembles what i had in mind, but I think i was trying to over-engineer things by being too keen to implement modern architecture for the first time! Thank you for bringing me back down to earth and pointing me in the right direction!Tic

Well, first of all you have to ask yourself why you are using CQRS. CQRS is the heavy 18-wheeler amongst architectures. I know of 2 good reasons that scream CQRS:

1) You need to support undo functionality

2) in the future when new requirements are implemented you want to apply those to past data too.

The part of the requirements that you are describing, however, feels very much like CRUD. (You import a set of rows, you list a set of rows, you edit those rows, and the ones marked as completed are then deleted from their input state and converted into some other kind of entity.)

If you feel there is a lot of complexity in describing the specific entities and the validation rules that apply, then DDD would be a good fit. But I would still consider scaling it down and building a simple MVC-style app to implement this (depending on what else is required of this project).

And even if this were part of a larger domain, I would suggest a microservices approach where this would be a completely standalone import application (in that case you could still raise an ImportCompleted event and put it on a service bus, with multiple other applications listening to that event).

NOTE: CQRS is not event sourcing. CQRS separates a command (update) stack from a query stack. It's often applied in combination with event sourcing. But having events that pop up everywhere can be a pain to maintain, especially since it's often less obvious who is raising an event and whether events interact with each other (what happens to an order if both an ordercompleted and an ordercanceled event are raised, possibly with timing issues over which one is handled first?).
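To show the distinction, here is a rough Python sketch of CQRS with no event sourcing at all: one class only changes state, the other only reads it. `OrderCommands`, `OrderQueries` and the rule that a completed order cannot be cancelled are assumptions for the example; an explicit rule like that is one way to decide the completed/cancelled conflict in a single place.

```python
class OrderCommands:
    """Command stack: changes state, returns nothing to display."""

    def __init__(self, orders):
        self._orders = orders

    def place_order(self, order_ref, lines):
        if order_ref in self._orders:
            raise ValueError("order already exists")
        self._orders[order_ref] = {"lines": list(lines), "status": "placed"}

    def complete_order(self, order_ref):
        order = self._orders[order_ref]
        if order["status"] == "cancelled":
            raise ValueError("a cancelled order cannot be completed")
        order["status"] = "completed"

    def cancel_order(self, order_ref):
        order = self._orders[order_ref]
        if order["status"] == "completed":
            raise ValueError("a completed order cannot be cancelled")
        order["status"] = "cancelled"


class OrderQueries:
    """Query stack: reads state, never changes it."""

    def __init__(self, orders):
        self._orders = orders

    def summary(self, order_ref):
        order = self._orders[order_ref]
        return {
            "order_ref": order_ref,
            "line_count": len(order["lines"]),
            "status": order["status"],
        }
```

Both stacks share one plain dict here to keep the sketch short; in a larger system the query side would often read from its own model.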

Boart answered 27/4, 2017 at 13:27 Comment(3)
I understand the difference between CQRS and Event Sourcing but probably muddied the waters with my approach. Having a history of how that entity came to be and changed would be desirable and to allow undo functionality (although maybe not for this particular context). I feel a bounded context for imports/exports, as mentioned by Constantin GALBENU, would be a good approach. I guess I'm over-eager to introduce modern architecture to a big ball of mud and trying to over-engineer, so will have a re-think!Tic
"I know of 2 good reasons that scream CQRS 1) You need to support undo functionality 2) in the future when new requirements are implemented you want to apply those to past data too." No, those are 2 good reasons for event sourcing; CQRS just separates your reads from your writes.Whinstone
true, but cqrs is often implemented with events/event sourcing. If you just separate read from writes, on a single db, then i see no advantage of cqrsBoart

I'm not a DDD expert, but these are my thoughts on approaching this. I wouldn't use a separate bounded context, because it feels to me that the import of domain objects can ideally live in the same bounded context as the one they are a part of. Keen to hear from experts why it would be wrong!

  1. Parse the csv into an aggregate representing the data import and persist this (to the staging area / tables etc). We can load this aggregate from here in future. The parsing of CSV file to create this aggregate could be modelled as a command "CreateDataImportFromCsvFile" etc.
  2. Build a UI that loads this aggregate and displays it. The aggregate can contain a list of domain objects ("customer import items"), and each "customer import item" can contain an "IsSelected" property as well as the domain object being imported, i.e. the "customer" domain object itself. This means you don't duplicate validation rules, as you are using the actual domain objects you intend to import. You hydrate those objects and display them in the UI. When the user clicks the import button, you issue a command. You handle that command by looping through each selected and valid "import item" on the aggregate, calling Save() on its domain model, and then marking the import item as processed. Ideally do this all within an outer transaction scope (depending on whether you want atomicity or eventual consistency). Your UI can then either hide processed import items or display them in a disabled state, depending on whether it is useful for the user to see what has been processed so far versus what remains.
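
As a rough Python sketch of the two steps above (all names and validation rules here are made up for illustration, and the outer transaction scope is left out):

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """The actual domain object; it owns its validation rules (assumed)."""
    name: str
    email: str

    def validate(self):
        errors = []
        if not self.name.strip():
            errors.append("name is required")
        if "@" not in self.email:
            errors.append("email looks invalid")
        return errors


@dataclass
class CustomerImportItem:
    customer: Customer
    is_selected: bool = False
    is_processed: bool = False


class DataImport:
    """Aggregate representing one uploaded file and its import items."""

    def __init__(self, items):
        self.items = list(items)

    def import_selected(self, save):
        """Save each selected, valid, unprocessed item and mark it processed.

        `save` is whatever persists a Customer (hypothetical callable).
        Returns the number of items imported.
        """
        imported = 0
        for item in self.items:
            if item.is_selected and not item.is_processed and not item.customer.validate():
                save(item.customer)
                item.is_processed = True
                imported += 1
        return imported
```

Items that are unselected or invalid simply stay unprocessed on the aggregate, so reloading it later shows what is still outstanding.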
Groats answered 18/7, 2021 at 10:0 Comment(0)
