MoreLinq maxBy vs LINQ max + where
Asked Answered
C

1

5

I am using EF5 with the MoreLinq extenstion, while testing my program in production (very big database), I found out that the line:

var x = db.TheBigTable.MaxBy(x => x.RecordTime);

Takes very long time (RecordTime is a non-indexed datetime)

Is that because MaxBy always runs on the client side (and firstly gets ALL records from the database)?

Capacious answered 21/10, 2013 at 0:32 Comment(1)
MoreLINQ is entirely client-side; you can't really use it with EF.Electroplate
C
9

Here is the signature of the MaxBy extension method:

public static TSource MaxBy<TSource, TKey>(this IEnumerable<TSource> source,
    Func<TSource, TKey> selector)
{
    return source.MaxBy(selector, Comparer<TKey>.Default);
}

It returns the maximal element (based on the given projection) of an IEnumerable<T>, not an IQueryable<T>. So the results of the query db.TheBigTable are indeed all loaded into memory first, and then they are iterated to find the maximum.

Constructive answered 21/10, 2013 at 0:38 Comment(1)
MaxBy() does need to stream all the items from the data source to do the comparisons client side, but it doesn't need to load all the items into memory first before it starts iterating. It can stream the items one by one, and for each item it only needs to compare it to the max found so far before looking at the next item. So it does have the performance penalty of sending all that data from the server to the client (instead of computing the max on the server), but it does not need to hold them all in memory at one time.Reld

© 2022 - 2024 — McMap. All rights reserved.