Why separate out data access?
From the book, I think the first two pages of the chapter Model-Driven Design give some justification for why you want to abstract technical implementation details away from the implementation of the domain model.
- You want to keep a tight connection between the domain model and the code
- Separating technical concerns helps prove the model is practical for implementation
- You want the ubiquitous language to permeate through to the design of the system
This seems to be all for the purpose of avoiding a separate "analysis model" that becomes divorced from the actual implementation of the system.
From what I understand of the book, it says this "analysis model" can end up being designed without considering the software implementation. Once developers try to implement the model understood by the business side, they form their own abstractions out of necessity, creating a wall in communication and understanding.
In the other direction, developers introducing too many technical concerns into the domain model can cause this divide as well.
So you could consider that separating out concerns such as persistence helps safeguard against the design and analysis models diverging. If it feels necessary to introduce things like persistence into the model, that is a red flag: maybe the model is not practical for implementation.
Quoting:
"The single model reduces the chances of error, because the design is now a direct outgrowth of the carefully considered model. The design, and even the code itself, has the communicativeness of a model."
The way I'm interpreting this: the more lines of code that deal with things like database access, the more of that communicativeness you lose.
If the need for accessing a database is for things like checking uniqueness, have a look at:
- Udi Dahan: the biggest mistakes teams make when applying DDD
  http://gojko.net/2010/06/11/udi-dahan-the-biggest-mistakes-teams-make-when-applying-ddd/
  under "All rules aren't created equal"
- Employing the Domain Model Pattern
  http://msdn.microsoft.com/en-us/magazine/ee236415.aspx#id0400119
  under "Scenarios for Not Using the Domain Model", which touches on the same subject.
How to separate out data access
Loading data through an interface
The "data access layer" has been abstracted through an interface, which you call in order to retrieve required data:
decimal total = 0;
var orderLines = OrderRepository.GetOrderLines(orderId);
foreach (var line in orderLines)
{
    total += line.Price;
}
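For context, the abstraction behind that call might be declared like this (a sketch; the IOrderRepository name and exact signature are my assumptions, since only the call site is shown above):

public interface IOrderRepository
{
    // Hides HOW the lines are fetched: raw SQL, an ORM, or an in-memory fake in tests.
    IEnumerable<OrderLine> GetOrderLines(string orderId);
}

The call site above would then go through a field or property named OrderRepository holding one of these.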
Pros: The interface separates out the "data access" plumbing code, allowing you to still write tests. Data access can be handled on a case-by-case basis, allowing better performance than a generic strategy.
Cons: The calling code must assume what has been loaded and what hasn't.
Say GetOrderLines returns OrderLine objects with a null ProductInfo property for performance reasons. The developer must have intimate knowledge of the code behind the interface.
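To make that concrete, here is a hypothetical caller tripping over the partially loaded object (the null ProductInfo is the example from above):

var orderLines = OrderRepository.GetOrderLines(orderId);
foreach (var line in orderLines)
{
    // Compiles and reads fine, but throws NullReferenceException at runtime:
    // GetOrderLines skipped ProductInfo for performance, and nothing in the
    // interface warns the caller.
    Console.WriteLine(line.ProductInfo.Name);
}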
I've tried this method on real systems. You end up changing the scope of what is loaded all the time in an attempt to fix performance problems. You end up peeking behind the interface to look at the data access code to see what is and isn't being loaded.
Now, separation of concerns should allow the developer to focus on one aspect of the code at a time, as much as is possible. The interface technique hides HOW the data is loaded, but not HOW MUCH data is loaded, WHEN it is loaded, or WHERE it is loaded from.
Conclusion: Fairly low separation!
Lazy Loading
Data is loaded on demand. Calls to load data are hidden within the object graph itself, where accessing a property can cause a SQL query to execute before the result is returned.
decimal total = 0;
foreach (var line in order.OrderLines)
{
    total += line.Price;
}
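To show the idea, a hand-rolled sketch (in practice an ORM such as NHibernate or Entity Framework generates proxies that do this for you; IOrderLineLoader here is a made-up helper):

public class Order
{
    private readonly Lazy<List<OrderLine>> orderLines;

    public Order(string orderId, IOrderLineLoader loader)
    {
        // No query runs here; the SQL executes on first access of OrderLines.
        orderLines = new Lazy<List<OrderLine>>(() => loader.LoadLines(orderId));
    }

    public IEnumerable<OrderLine> OrderLines
    {
        get { return orderLines.Value; }
    }
}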
Pros: The 'WHEN, WHERE, and HOW' of data access is hidden from the developer focusing on domain logic. There is no code in the aggregate that deals with loading data. The amount of data loaded can be the exact amount required by the code.
Cons: When you are hit with a performance problem, it is hard to fix with a generic "one size fits all" solution. Lazy loading can cause worse performance overall (e.g. the classic N+1 queries problem), and implementing lazy loading may be tricky.
Role Interface/Eager Fetching
Each use case is made explicit via a Role Interface implemented by the aggregate class, allowing for data loading strategies to be handled per use case.
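The role interface and the loading contract might be declared like this (a sketch pieced together from the snippets below; IBillOrder and ILoadDataFor appear there, but the member signatures are my assumptions):

// Role interface: the Order aggregate as seen by the billing use case only.
public interface IBillOrder
{
    void BillOrder(BillOrderCommand command);
}

// Loading contract: one implementation per role, so each use case
// can have its own fetching strategy.
public interface ILoadDataFor<TRole, TAggregate> where TAggregate : TRole
{
    TAggregate Load(string aggregateId);
}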
The fetching strategy itself may then look like this:
public class BillOrderFetchingStrategy : ILoadDataFor<IBillOrder, Order>
{
    public Order Load(string aggregateId)
    {
        var order = new Order();
        // Fetch exactly the data the billing use case needs: lines with prices.
        order.Data = GetOrderLinesWithPrice(aggregateId);
        return order;
    }
}
Then your aggregate can look like:
public class Order : IBillOrder
{
    public void BillOrder(BillOrderCommand command)
    {
        decimal total = 0;
        foreach (var line in this.Data.OrderLines)
        {
            total += line.Price;
        }
        // etc...
    }
}
The BillOrderFetchingStrategy is used to build the aggregate, and then the aggregate does its work.
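Putting it together, the calling code (an application service, say) might look like this (a sketch; the command's AggregateId property is my assumption):

// Pick the strategy for this use case, load the aggregate, then let it act.
var strategy = new BillOrderFetchingStrategy();
IBillOrder order = strategy.Load(command.AggregateId);
order.BillOrder(command);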
Pros: Allows custom code per use case, enabling optimal performance. Is in line with the Interface Segregation Principle. No complex code requirements. The aggregate's unit tests do not have to mimic the loading strategy. A generic loading strategy can be used for the majority of cases (e.g. a "load all" strategy), and special loading strategies can be implemented when necessary.
Cons: The developer still has to adjust/review the fetching strategy after changing domain code.
With the fetching strategy approach you might still find yourself changing custom fetching code when a business rule changes. It's not a perfect separation of concerns, but it ends up more maintainable and is better than the first option. The fetching strategy does encapsulate HOW, WHEN, and WHERE the data is loaded. It gives a better separation of concerns, without losing flexibility the way the one-size-fits-all lazy loading approach does.