repositories and querying with raw sql?

Asked 27/2, 2012 at 20:35 Answered 29/2, 2012 at 18:40

Solved c#domain-driven-design repository repository-pattern ddd-repositories

I am struggling to understand how to best query a repository.

The three factors that are throwing me through a loop right now are:

Return type of data
Columns to run query on
Number of records to return

Point 1

In regards to question one:

I have Repositories with lot of methods that return a combination of both Entities and scalar values. This seems to lead to "method explosion". Should I always return an Entity object? How should I query for objects where I only need one column?

Point 2 When running a query should I include every column in the table even if I only need one, or two columns? If I create specific queries for this it leads to more methods in the Repository

Point 3 How should I provide conditions for the query? I read about Specifications, but my understanding is that you loop through the returned records and filter out the ones that pass into a new collection. This doesn't seem like a good idea performance wise. Right Now I just make a new method in the Repo like getNameById() which encapsulates the condition.

Please not that I am not using an ORM, I just have raw sql in my Repositories.

Update

Point 1: Based on the answers and a bit more research would this be a good implementation?

Right now I have a large repository that return a mix of scalar and entity type objects (all same entity). I'm thinking I could reduce this greatly if I just use a GetUser(userId) method and forget writing methods that just return single column values.

For example if I need to return a user name I could call the GetUser(userId) method that hydrates the User object and then in the service layer just filter it down to the username.

Another way would be to use some sort of QueryBuilder class I could pass into the Repository which could be parsed to generate the proper sql.

Point 2

Looking back this is pretty similar to point one and my current solution would be to just grab all table fields. It's a tradeoff between performance and maintainability.

Point 3

I would need to provide some sort of where clause. I'm not sure if this make sense doing via Specification or just a sql string. My current solution is to make new methods for these types, but I would like something more generic for the Repository

Overall, still researching into this... I'd love to hear more input into this or links to books or references that kind of tie this all together.

Veliger answered 27/2, 2012 at 20:35 Comment(3)

Do you create Sql command dynamically in your class? – Benzine 27/2, 2012 at 20:43

Raw SQL and no ORM, ha? Like living on the edge, do we? Seriously, why do you want to deal with raw sql? – Overturf 27/2, 2012 at 21:12

@kerezo - I have sql parameters and use addwith value – Veliger 27/2, 2012 at 21:30

I have Repositories with lot of methods that return a combination of both Entities and scalar values. This seems to lead to "method explosion". Should I always return an Entity object? How should I query for objects where I only need one column?

You can fight repository method explosion similar to how you would fight other SRP violations. You can create another repository for the same entity. See this answer to a similar question.

When running a query should I include every column in the table even if I only need one, or two columns? If I create specific queries for this it leads to more methods in the Repository

This is not a DDD question. Domain driven design does not deal with 'rows and columns'. There is always some redundancy in how much data you load to 'hydrate' the domain object, but you have to measure whether this really affects your performance. If this is really a performance bottleneck than it maybe a symptom of incorrect domain model.

How should I provide conditions for the query? I read about Specifications, but my understanding is that you loop through the returned records and filter out the ones that pass into a new collection. This doesn't seem like a good idea performance wise. Right Now I just make a new method in the Repo like getNameById() which encapsulates the condition.

This again is a data access issue. Nothing in DDD says that your repository can not convert Specification to a SQL query. It is up to you whether you do this or iterate over records in memory (as long as repository consumer only sees Specification and Repository and stays unaware of the actual implementation).

Regarding 'Raw SQL vs. ORM in DDD' you may find this answer interesting.

Mckale answered 27/2, 2012 at 21:48 Comment(3)

Could you point to some answers or tutorials that address this data access issue? – Veliger 27/2, 2012 at 22:51

DDD book itself has an example about Specification that is converted to raw SQL because of performance considerations. – Mckale 27/2, 2012 at 22:57

Is that the blue one by Eric Evans? Wouldn't happen to know the page number of that example? :) – Veliger 28/2, 2012 at 0:50

I agree with everything Dmitry says, but perhaps think you should have a read of CQRS.

I used to ask similar questions when getting started with DDD (regarding 'method explosion', not your SQL issues), and this lead me to CQRS. Personally, I don't really see how DDD is practical without it, and it answers a lot of these sorts of questions when it comes to querying data. Using it's principles what I'd suggest is:

Only use domain repositories when committing a transaction. That is, you don't use repositories to display data in the UI. You only fetch aggregates from your repository when you want to perform an operation against them.
Your repositories only return aggregates, not individual entities separately. This makes sense now as we are only using repositories in a transactional sense, and entities can only be mutated via atomic operations and persisted by the aggregate as a whole.
You create separate repositories (or 'query services') which provide tailor made queries and data types for whatever data you need. These can return dumb DTOs with no logic.

This keeps your proper domain & repositories clean, whilst providing the means to create a thin data access layer that provides high performing queries.

Regarding the specification pattern: rather than converting it to a SQL query in code, you could provide public properties on the specification that represent the criteria. These values could then be added in the where clause of your SQL or sent as parameters to a SPROC.

Windowshop answered 28/2, 2012 at 9:18 Comment(0)

First of all, you haven't really explained what you are using all these queries for. Chances are it's for user interface needs. If so, there's no need to jump through all these hoops (service->repository->domain->dto->client), just query the database as directly as possible. And what do you know, gone are the questions whether you can query for scalars or just the columns you need. Just use plain sql and return what you need. Don't create abstractions that cause friction.

Overturf answered 27/2, 2012 at 21:35 Comment(2)

All I'm looking for is a "best practice" approach to querying repositories so I don't have a million methods and it somehow factors in some decent performance as well. – Veliger 27/2, 2012 at 22:54

Start by not adding them to the repository in the first place if they don't serve the purpose of the domain. Move them over to another facade that represents the queries the user interface needs. Go as fast from client to facade to db to facade to client. Use simple DTOs to represent state that needs to be transported to the client. – Overturf 28/2, 2012 at 16:15

Chobo,

We need remember two things about Repository [Fowler PoEAA][Evans DDD] pattern:

Use a Repository pattern as a simple collection. Repository abstract these infrastructure details because aren't from the domain.
If a Repository pattren is a collection, it is a cluster of objects of the same type.

Two others types may help your Repository: the Query Object [Fowler PoEAA] and Data Mapper [Fowler PoEAA] patterns. Query Object pattern aggregate criterias using object oriented methods and know how to translate them as a SQL statement. Data Mapper pattern know map the object states from application and the table columns from databases.

You can use Lazy Load pattern [Fowler PoEAA] to mitigate the problem of large object in memory.

Success for you!

Sump answered 29/2, 2012 at 18:40 Comment(1)

I think the Query Object would help a lot with the problem of querying with different criteria. I'm guessing ORM's have this built in. Thanks for the list of patterns that address these issues! – Veliger 29/2, 2012 at 23:20

Recommended topics

Hot tags