Query where multiple columns have to match a value set simultaneously in EF Core
Asked Answered
L

3

6

The problem

I have a client application which loads a local data file. This data file specifies for each item the Type and Version.

From this file, I compile a list of Type and Version pairs.

var typeVersionSets = datafile.Select(item => new { Type = item.TypeId, Version = item.VersionId }).Distinct();

Note: there are more than these two fields, but for the sake of simplicity I just denote these two.

I also have a SQL Server which runs in the cloud. I need to get all records from a table which meet the value pairs (so the column values must match simultaneously).

I wrote this simple query which cannot be run by EF Core:

List<MyTableRow> MyResult = await dbContext.MyTable
    .Where(dbItem => typeVersionSets.Contains(new { Type = dbItem.TypeId, Version = dbItem.VersionId }))
    .ToListAsync();

I get the following runtime error:

One or more errors occurred. (The LINQ expression 'DbSet().Where(p => __MyTableRowTypeVersions_2.Contains(new { Type = p.TypeId, Version = p.VersionId }))' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.)

TLDR some details

The MyTable is huge and I cannot afford to download it every time and evaluate the LINQ expression on the client.

The number of typeVersionSets is reasonably small (let's say 10 sets).

Of course, I can loop over typeVersionSets like:

List<MyTableRow> MyResult = new List<MyTableRow>();

foreach (var set in typeVersionSets)
{
    MyResult.AddRange(
            await dbContext.MyTable
                .Where(pp => pp.TypeId == set.Type && pp.VersionId == set.Version)
                .ToListAsync()
            );
}
    

However, this would require 10 database calls.

This code will be executed many times per user and by many users.

Is there a more efficient solution which would result in 1 database call per event without transferring a lot of unnecessary data to the client (or the server).

Some additional notes

I use:

  • .NET (core) 5.0
  • Entity Framework Core version 5.0.9.

In case it matters, I cannot migrate to EF Core 6 since this required a migration to .NET (core) 6.0 which raises a lot of issues which are out of my scope.

Lox answered 17/1, 2022 at 16:4 Comment(5)
what about executing row sql query - learn.microsoft.com/en-us/ef/core/querying/raw-sql, sometimes is betterSelwyn
@Selwyn You mean a raw SQL query. I like the idea. It would be very readable. But how to incorporate the (dynamic) data in my C# list typeVersionSets in this query. Use a string builder and a loop or is there some nicer way?Lox
nicer will be using of string.Join(..., instead of looping. unfortunaly ef raw sql do not undersand from collection parameters, i think !?...,Selwyn
github.com/dotnet/efcore/issues/32092Squander
IMHO joining to a TVP would be more efficient when the collection is large enough.Slay
H
2

I'd be inclined to build a dynamic Expression<Func<MyTableRow, bool>> to represent the filter.

var p = Expression.Parameter(typeof(MyTableRow), "dbItem");

var parts = new List<Expression>();
foreach (var set in typeVersionSets)
{
    var typeIdValue = Expression.Property(p, nameof(MyTableRow.TypeId));
    var typeIdTarget = Expression.Constant(set.Type);
    var typeIdTest = Expression.Equal(typeIdValue, typeIdTarget);
    
    var versionIdValue = Expression.Property(p, nameof(MyTableRow.VersionId));
    var versionIdTarget = Expression.Constant(set.Version);
    var versionIdTest = Expression.Equal(versionIdValue, versionIdTarget);
    
    var part = Expression.AndAlso(typeIdTest, versionIdTest);
    parts.Add(part);
}

var body = parts.Aggregate(Expression.OrElse);
var filter = Expression.Lambda<Func<MyTableRow, bool>>(body, p);

List<MyTableRow> MyResult = await dbContext.MyTable
    .Where(filter)
    .ToListAsync()

Expression Trees (C#) | Microsoft Docs

Haley answered 17/1, 2022 at 16:15 Comment(3)
This works, thank you! Although I must say, this is quite complex and therefore somewhat hard to read what is happening. However, it still works!Lox
What does the magic constant "dbItem" actually do in your first line?Lox
@Lox It just gives a name to the parameter, in case you want to debug the final expression. It doesn't actually make any difference to the code, and you can omit it if you'd prefer.Haley
L
6

You can use this extension:

dbContext.MyTable
    .FilterByItems(typeVersionSets, (pp, set) => pp.TypeId == set.Type && pp.VersionId == set.Version, true)
    .ToListAsync();
Literary answered 17/1, 2022 at 16:50 Comment(3)
Thank you! Do I understand correctly that, under the hood, this answer is the same Richard Deeming's answer? Only this extension provides a clean concise way of writing the Linq query? I like the readability.Lox
Exactly. Just makes things easier. You do not need to work with Expressions, function hides that under hood.Literary
Perfect! This should be a default feature in EF core!Lox
H
2

I'd be inclined to build a dynamic Expression<Func<MyTableRow, bool>> to represent the filter.

var p = Expression.Parameter(typeof(MyTableRow), "dbItem");

var parts = new List<Expression>();
foreach (var set in typeVersionSets)
{
    var typeIdValue = Expression.Property(p, nameof(MyTableRow.TypeId));
    var typeIdTarget = Expression.Constant(set.Type);
    var typeIdTest = Expression.Equal(typeIdValue, typeIdTarget);
    
    var versionIdValue = Expression.Property(p, nameof(MyTableRow.VersionId));
    var versionIdTarget = Expression.Constant(set.Version);
    var versionIdTest = Expression.Equal(versionIdValue, versionIdTarget);
    
    var part = Expression.AndAlso(typeIdTest, versionIdTest);
    parts.Add(part);
}

var body = parts.Aggregate(Expression.OrElse);
var filter = Expression.Lambda<Func<MyTableRow, bool>>(body, p);

List<MyTableRow> MyResult = await dbContext.MyTable
    .Where(filter)
    .ToListAsync()

Expression Trees (C#) | Microsoft Docs

Haley answered 17/1, 2022 at 16:15 Comment(3)
This works, thank you! Although I must say, this is quite complex and therefore somewhat hard to read what is happening. However, it still works!Lox
What does the magic constant "dbItem" actually do in your first line?Lox
@Lox It just gives a name to the parameter, in case you want to debug the final expression. It doesn't actually make any difference to the code, and you can omit it if you'd prefer.Haley
S
1

From https://github.com/dotnet/efcore/issues/32092#issuecomment-2221633692

/// <see>https://github.com/dotnet/efcore/issues/32092#issuecomment-2221633692</see>
/// <see>https://mcmap.net/q/1686724/-query-where-multiple-columns-have-to-match-a-value-set-simultaneously-in-ef-core/78732959#78732959</see>
public static IQueryable<TEntity> WhereOrContainsValues<TEntity, TToCompare>(
    this IQueryable<TEntity> queryable,
    IEnumerable<TToCompare> valuesToCompare,
    IEnumerable<Func<TToCompare, Expression<Func<TEntity, bool>>>> comparatorExpressionFactories) =>
    queryable.Where(valuesToCompare.Aggregate(
        LinqKit.PredicateBuilder.New<TEntity>(),
        (outerPredicate, valueToCompare) => outerPredicate.Or(
            comparatorExpressionFactories.Aggregate(
                LinqKit.PredicateBuilder.New<TEntity>(),
                (innerPredicate, expressionFactory) =>
                    innerPredicate.And(expressionFactory(valueToCompare))))));

Taking the example from https://github.com/dotnet/efcore/issues/32092#issue-1951139044

public class Order
{
     [Key]
     public long TenantId {get;set;}

    [Key]
     public long Id {get;set;}

    //other properties...
}

(long tenantId, long orderId)[] orderIds = //array of (long, long) tuples of the composite ids of the orders that need to be fetched
var orders = await ctx.Orders.Where(o => orderIds.Contains((o.TenantId, o.Id)).ToArrayAsync(); //this doesn't work

it now would be:

var orders = await ctx.Orders.WhereOrContainsValues(orderIds,
[
    orderId => order => orderId.tenantId == order.TenantId,
    orderId => order => orderId.orderId == order.Id 
]).ToArrayAsync();

and translated to SQL like:

SELECT fields FROM Orders
WHERE (TenantId = $1 AND Id = $2)
   OR (TenantId = $3 AND Id = $4)
-- keep go on for each item in orderIds
Squander answered 10/7 at 22:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.