NOTE
I'm not asking if I should use the Repository pattern, I care about the How. Injecting persistence-related objects into domain classes is not an option for me: it makes Unit Testing impossible (and no, tests using in-memory databases are NOT Unit Tests, as they cover many different classes without isolation), it couples the domain logic with the ORM, and it breaks many important principles I practice, like Persistence Ignorance, Separation of Concerns, and others, whose benefits you're welcome to search online. Using EF Core "correctly" is not nearly as important to me as keeping the business logic isolated from external concerns, which is why I'll settle for a "hacky" usage of EF Core if it means the Repository won't be a leaky abstraction anymore.
Original Question
Let's assume the repository's interface is the following:
public interface IRepository<TEntity>
    where TEntity : Entity
{
    void Add(TEntity entity);
    void Remove(TEntity entity);
    Task<TEntity?> FindByIdAsync(Guid id);
}

public abstract class Entity
{
    public Entity(Guid id)
    {
        Id = id;
    }

    public Guid Id { get; }
}
Most of the EF Core implementations I saw online did something like:
public class EFCoreRepository<TEntity> : IRepository<TEntity>
    where TEntity : Entity
{
    private readonly DbSet<TEntity> entities;

    public EFCoreRepository(DbContext dbContext)
    {
        entities = dbContext.Set<TEntity>();
    }

    public void Add(TEntity entity)
    {
        entities.Add(entity);
    }

    public void Remove(TEntity entity)
    {
        entities.Remove(entity);
    }

    public async Task<TEntity?> FindByIdAsync(Guid id)
    {
        // Only queries the database; pending (unsaved) changes are not visible here.
        return await entities.FirstOrDefaultAsync(e => e.Id == id);
    }
}
The changes are committed in another class, in an implementation of the Unit of Work pattern.
The problem I have with this implementation is that it violates the definition of a repository as a "collection-like" object. Users of this class would have to know that the data is persisted in an external store and call the Save() method themselves. The following snippet won't work:
var entity = new ConcreteEntity(id: Guid.NewGuid());
repository.Add(entity);
var result = await repository.FindByIdAsync(entity.Id); // Will return null
The changes should obviously not be committed after every call to Add(), because that defeats the purpose of the Unit of Work, so we end up with a weird, not very collection-like interface for the repository.
In my mind, we should be able to treat a repository exactly like we would treat a regular in-memory collection:
var list = new List<ConcreteEntity>();
var entity = new ConcreteEntity(id: Guid.NewGuid());
list.Add(entity);
// No need to save here
var result = list.FirstOrDefault(e => e.Id == entity.Id);
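To make that expectation concrete, here is a minimal sketch of an in-memory stand-in for the interface. InMemoryRepository is a hypothetical name used only for illustration, not part of EF Core; Entity and IRepository are repeated from the top of the question, and ConcreteEntity is the placeholder entity from the snippets, so that the block compiles on its own.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public abstract class Entity
{
    public Entity(Guid id)
    {
        Id = id;
    }

    public Guid Id { get; }
}

public interface IRepository<TEntity>
    where TEntity : Entity
{
    void Add(TEntity entity);
    void Remove(TEntity entity);
    Task<TEntity?> FindByIdAsync(Guid id);
}

// Placeholder entity, as used in the snippets of the question.
public sealed class ConcreteEntity : Entity
{
    public ConcreteEntity(Guid id) : base(id) { }
}

// Hypothetical stand-in: behaves like a plain collection keyed by Id.
public sealed class InMemoryRepository<TEntity> : IRepository<TEntity>
    where TEntity : Entity
{
    private readonly Dictionary<Guid, TEntity> items = new();

    public void Add(TEntity entity) => items.Add(entity.Id, entity);

    public void Remove(TEntity entity) => items.Remove(entity.Id);

    public Task<TEntity?> FindByIdAsync(Guid id)
    {
        // An added entity is visible immediately; there is no save step.
        items.TryGetValue(id, out var found);
        return Task.FromResult<TEntity?>(found);
    }
}
```

With such a stand-in, Add() followed by FindByIdAsync() returns the same instance without anything being saved in between, which is the behaviour the EF Core implementation would have to match to be collection-like.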
When the transaction scope ends, the changes can be committed to the DB, but apart from the low-level code that deals with the transaction, I don't want the domain logic to care about when the transaction is committed. What we can do to implement the interface in this fashion is to use the DbSet's Local collection in addition to the regular DB query. That would be:
...
public async Task<TEntity?> FindByIdAsync(Guid id)
{
    // Check the entities already tracked by the context first, then fall back to the database.
    var entity = entities.Local.FirstOrDefault(e => e.Id == id);
    return entity ?? await entities.FirstOrDefaultAsync(e => e.Id == id);
}
This works, but this generic implementation would then be derived in concrete repositories with many other methods that query data. All of these queries will have to be implemented with the Local collection in mind, and I haven't found a clean way to force concrete repositories to take local changes into account. So my question really boils down to:
- Is my interpretation of the Repository pattern correct? Why is there no mention of this problem in other implementations online? Even Microsoft's implementation (which is a bit outdated, but the idea is the same) in the official documentation website ignores local changes when querying.
- Is there a better solution to include local changes in EF Core than manually querying both the DB and the Local collection every time?
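For reference, the manual approach from the second question can be sketched as one reusable merge step. The helper below is hypothetical (LocalAwareQuery and Merge are not EF Core APIs) and is written against plain collections so it runs on its own. It assumes that, in EF Core, `stored` would be the result of the database query, `local` would be DbSet<T>.Local (which leaves out entities marked as Deleted), `isDeleted` would check the entity's tracked state for EntityState.Deleted, and `predicate` would be the compiled form of the filter that was sent to the database.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not an EF Core API.
public static class LocalAwareQuery
{
    public static List<T> Merge<T>(
        IEnumerable<T> stored,      // what the database query returned
        IEnumerable<T> local,       // tracked entities that are not pending deletion
        Func<T, bool> isDeleted,    // true for entities that are pending deletion
        Func<T, bool> predicate)    // in-memory version of the query filter
        where T : class
    {
        return stored
            .Concat(local)
            .Distinct()                 // a tracked entity can show up in both sources
            .Where(e => !isDeleted(e))  // the database still returns rows pending deletion
            .Where(predicate)           // re-check the filter against current in-memory values
            .ToList();
    }
}
```

The filter is applied again in memory because a locally modified entity may match in the database but no longer match locally, or the other way round. One caveat of this sketch: an in-memory predicate can behave differently from its SQL translation (string comparison and null handling, for example), so the two results are not guaranteed to agree in every case.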
UPDATE - My Solution
I ended up implementing the second solution suggested by @Ronald's answer. I made the repository save the changes to the database automatically, and wrapped every request in a database transaction. One thing I changed from the proposed solution is that I call SaveChangesAsync on every read, not on every write. This is similar to what Hibernate already does (in Java). Here is a simplified implementation:
public abstract class EFCoreRepository<TEntity> : IRepository<TEntity>
    where TEntity : Entity
{
    private readonly DbSet<TEntity> dbSet;

    public EFCoreRepository(DbContext dbContext)
    {
        dbSet = dbContext.Set<TEntity>();
        Entities = new EntitySet<TEntity>(dbContext);
    }

    // Concrete repositories query through this property, so every read flushes pending changes first.
    protected IQueryable<TEntity> Entities { get; }

    public void Add(TEntity entity)
    {
        dbSet.Add(entity);
    }

    public async Task<TEntity?> FindByIdAsync(Guid id)
    {
        return await Entities.SingleOrDefaultAsync(e => e.Id == id);
    }

    public void Remove(TEntity entity)
    {
        dbSet.Remove(entity);
    }
}
internal class EntitySet<TEntity> : IQueryable<TEntity>
    where TEntity : Entity
{
    private readonly DbSet<TEntity> dbSet;

    public EntitySet(DbContext dbContext)
    {
        dbSet = dbContext.Set<TEntity>();
        Provider = new AutoFlushingQueryProvider<TEntity>(dbContext);
    }

    public Type ElementType => dbSet.AsQueryable().ElementType;

    public Expression Expression => dbSet.AsQueryable().Expression;

    public IQueryProvider Provider { get; }

    // GetEnumerator() omitted...
}
internal class AutoFlushingQueryProvider<TEntity> : IAsyncQueryProvider
    where TEntity : Entity
{
    private readonly DbContext dbContext;
    private readonly IAsyncQueryProvider internalProvider;

    public AutoFlushingQueryProvider(DbContext dbContext)
    {
        this.dbContext = dbContext;
        var dbSet = dbContext.Set<TEntity>().AsQueryable();
        internalProvider = (IAsyncQueryProvider)dbSet.Provider;
    }

    public TResult ExecuteAsync<TResult>(Expression expression, CancellationToken cancellationToken = default)
    {
        // TResult is expected to be Task<T>; extract T so the generic core method can be called.
        var internalResultType = typeof(TResult).GenericTypeArguments.First();

        // Calls this.ExecuteAsyncCore<internalResultType>(expression, cancellationToken)
        object? result = GetType()
            .GetMethod(nameof(ExecuteAsyncCore), BindingFlags.NonPublic | BindingFlags.Instance)
            ?.MakeGenericMethod(internalResultType)
            ?.Invoke(this, new object[] { expression, cancellationToken });

        if (result is not TResult)
            throw new Exception(); // This should never happen

        return (TResult)result;
    }

    private async Task<TResult> ExecuteAsyncCore<TResult>(Expression expression, CancellationToken cancellationToken)
    {
        // Flush pending changes so the query below sees them.
        await dbContext.SaveChangesAsync(cancellationToken);
        return await internalProvider.ExecuteAsync<Task<TResult>>(expression, cancellationToken);
    }

    // Other interface methods omitted...
}
Notice the use of IAsyncQueryProvider, which forced me to use a small Reflection hack. This was required to support the asynchronous LINQ methods that come with EF Core.
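The Reflection step can be looked at in isolation. The sketch below is a standalone illustration of the dispatch technique only: the caller supplies TResult = Task<T>, T is extracted at run time, and a generic core method is invoked for it. TaskUnwrappingDispatcher and its members are hypothetical names, there is no EF Core involved, and the explicit guard for non-Task<T> result types is an assumption added here rather than something taken from the implementation above.

```csharp
#nullable enable
using System;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical demo type; only the dispatch technique mirrors the provider above.
public sealed class TaskUnwrappingDispatcher
{
    // Callers pass TResult = Task<T>; T is only known at run time.
    public TResult Execute<TResult>(object value)
    {
        Type resultType = typeof(TResult);
        if (!resultType.IsGenericType || resultType.GetGenericTypeDefinition() != typeof(Task<>))
            throw new NotSupportedException($"Expected Task<T>, got {resultType}.");

        Type innerType = resultType.GenericTypeArguments[0];

        // Builds ExecuteCore<innerType> and calls it, like the MakeGenericMethod call above.
        MethodInfo core = typeof(TaskUnwrappingDispatcher)
            .GetMethod(nameof(ExecuteCore), BindingFlags.NonPublic | BindingFlags.Instance)!
            .MakeGenericMethod(innerType);

        // Invoke returns object; its runtime type is a Task<innerType>, which is TResult.
        return (TResult)core.Invoke(this, new[] { value })!;
    }

    private async Task<T> ExecuteCore<T>(object value)
    {
        await Task.Yield(); // stands in for the awaited work (SaveChangesAsync in the post)
        return (T)value;
    }
}
```

A guard like the one at the top of Execute may be worth having in the real provider as well: if the provider is ever asked for a result type that is not Task<T>, a descriptive exception is easier to diagnose than a bare Exception.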
database is an implementation detail, but persistence is not, but I may be wrong here. I think the question is an interesting one and will watch for a decent answer :) – Preclinical
if you're talking about EF repos only. Your db can be a text/CSV file, which has different definitions for a repo (generally, a repo is a repo, no matter whether it is for a text, flat or relational db). EF has change tracking, while Foo has not. Some have transaction capability, some don't. Even with the specification pattern, the behavior/definition of repos should differ from EF to Mongo, etc. My point is that it's the abstraction that defines the level of expectations from a repo. Therefore, we can't say the Correct Repository Pattern is exactly this or that – Formation