IQueryable OrderBy with Func<TModel, TValue> : what's happening
Asked Answered
B

4

7

With the code given from this question OrderBy is not translated into SQL when passing a selector function

Func<Table1, string> f = x => x.Name;
var t = db.Table1.OrderBy(f).ToList();

The translated SQL is:

SELECT 
[Extent1].[ID] AS [ID], 
[Extent1].[Name] AS [Name]
FROM [dbo].[Table1] AS [Extent1]

OK.

I can understand that the code compiles : IQueryable inherits from IEnumerable, which have an OrderBy method taking a Func<TModel, TValue> as parameter.

I can understand that the ORDER BY clause is not generated in SQL, as we didn't pass an Expression<Func<TModel, TValue>> as the OrderBy parameter (the one for IQueryable)

But what happens behind the scene ? What happens to the "wrong" OrderBy method ? Nothing ? I can't see how and why... Any light in my night ?

Becharm answered 31/8, 2012 at 20:21 Comment(0)
C
5

Because f is a delegate rather than an expression, the compiler picks the IEnumerable OrderBy extension method instead of the IQueryable one.

This then means that all the results are fetched from the database, because the ordering is then done in memory as if it were Linq to Objects. That is, in-memory, the ordering can only be done by fetching all the records.

Of course, in reality this still doesn't actually happen until you start enumerating the result - which in your case you do straight away because you eager-load the result with your call to ToList().

Update in response to your comment

It seems that your question is as much about the IQueryable/IEnumerable duality being 'dangerous' from the point of view of introducing ambiguity. It really isn't:

t.OrderBy(r => r.Field)

C# sees the lambda as an Expression<> first and foremost so if t is an IQueryable then the IQueryable extension method is selected. It's the same as a variable of string being passed to an overloaded method with a string and object overload - the string version will be used because it's the best representation.

As Jeppe has pointed out, it's actually because the immediate interface is used, before inherited interfaces

t.AsEnumerable().OrderBy(r => r.Field)

C# can't see an IQueryable any more, so treats the lambda as a Func<A, B>, because that's it's next-best representation. (The equivalent of only an object method being available in my string/object analogy before.

And then finally your example:

Func<t, string> f = r => r.Field;
t.OrderBy(f);

There is no possible way that a developer writing this code can expect this to be treated as an expression for a lower-level component to translate to SQL, unless the developer fundamentally doesn't understand the difference between a delegate and an expression. If that's the case, then a little bit of reading up solves the problem.

I don't think it's unreasonable to require a developer to do a little bit of reading before they embark on using a new technology; especially when, in MSDN's defence, this particular subject is covered so well.

I realise now that by adding this edit I've now nullified the comment by @IanNewson below - but I hope it provides a compelling argument that makes sense :)

Crafton answered 31/8, 2012 at 20:25 Comment(7)
Simple and straightforward explanation, +1Obsess
Fine and clear, thx. But (real question) : isn't this kind of interface inheritance design a little bit "dangerous" ? I mean, everything compiles perfectly, acts perfectly, but... behind, that's not what you meant. We can say that you should know your msdn, but you could easily make wrong things, no ?Wildon
Well, I can see why you might say that, yes, but on the flip side I think it kind of falls into the 'schoolboy error' category for the Linq part of the language in the same way that if(a = b) does. It's also heavily mitigated by the fact that the language construct () => foo is treated by the language as being more like an Expression<Func<A, B> than a Func<A, B> and you never normally think about it. In your case you are expressly passing a Func<A,B> so realistically you are choosing the interface :)Crafton
@RaphaëlAlthaus props for having the courage to ask this question; I'm sure you were worried about whether you'd just get a load of 'it just is, stupid!' responses (a phenomenon that has been known to occur on occasion here on SO!).Crafton
Thanks for your edit. And I wasn't too worried, I'm a big boy now ;) But I must say that the "unless the developer fundamentally doesn't understand the difference between a delegate and an expression" looks quite optimistic... (depending, maybe, of the definition you give to "a developer"). In practice, I'm sure you also see "Great, it compiles and give the result, fine. Next." philosophy.Wildon
@RaphaëlAlthaus I agree that this is a concern, but I am not sure I'd call it dangerous, as you did in your comment to Jon Skeet's answer. Making this mistake doesn't affect the correctness of the program (as if (a = b) might), only its performance. If performance is a problem, the developer should profile the program, which would expose the error so it can be corrected.Ives
Interseting (the Update part)! I have a comment on why the method taking an expression tree is used, and not the method taking a simple delegate. But the comment got too long, so I posted it as a new answer.Lytta
A
3

But what happens behind the scene?

Assuming db.Table1 returns an Table<Table1>, the compiler will:

  • Check whether Table<T> has an OrderBy method - nope
  • Check whether any of the base classes or interfaces it implements has an OrderBy method - nope
  • Start looking at extension methods

It will find both Queryable.OrderBy and Enumerable.OrderBy as extension methods which match the target type, but the Queryable.OrderBy method isn't applicable, so it uses Enumerable.OrderBy instead.

So you can think of it as if the compiler has rewritten your code into:

List<Table1> t = Enumerable.ToList(Enumerable.OrderBy(db.Table1, f));

Now at execution time, Enumerable.OrderBy will iterate over its source (db.Table1) and perform the appropriate ordering based on the key extraction function. (Strictly speaking, it will immediately return an IEnumerable<T> which will iterate over the source when it's asked for the first result.)

Avian answered 31/8, 2012 at 20:24 Comment(2)
Same remark here, of course, fine and clear, thx.And same question : isn't this kind of interface inheritance design a little bit "dangerous" ? I mean, everything compiles perfectly, acts perfectly, but... behind, that's not what you meant. We can say that you should know your msdn, but you could easily make wrong things, no ?Wildon
@RaphaëlAlthaus See my new answer for that. Jon Skeet just answered what you asked, and you didn't ask (originally) why "source.OrderBy(x => x.Name)" chooses the IQueryable<> method and not the IEnumerable<> one.Lytta
I
2

The queryable returns all records (hence no WHERE clause in the SQL statement), and then the Func is applied to the objects in the client's memory, through Enumerable.OrderBy. More specifically, the OrderBy call resolves to Enumerable.OrderBy because the parameter is a Func. You can therefore rewrite the statement using static method call syntax, to make it a bit clearer what's going on:

Func<Table1, string> f = x => x.Name; 
var t = Enumerable.OrderBy(db.Table1, f).ToList(); 

The end result is that the sort specified by OrderBy is done by the client process rather than by the database server.

Ives answered 31/8, 2012 at 20:24 Comment(0)
L
2

This answer is a kind of comment to Andras Zoltan's answer (but this is too long to fit in the comment format).

Zoltan's answer is interesting and mostly correct, except the phrase C# sees the lambda as an Expression<> first and foremost [...].

C# sees a lambda (and any anonymous function) as equally "close" to a delegate and the Expression<> (expression tree) of that same delegate. According to the C# specification, neither is a "better conversion target".

So consider this code:

class C
{
    public void Overloaded(Expression<Func<int, int>> e)
    {
        Console.WriteLine("expression tree");
    }
    public void Overloaded(Func<int, int> d)
    {
        Console.WriteLine("delegate");
    }
}

Then:

var c = new C();
c.Overloaded(i => i + 1);   // will not compile! "The call is ambiguous ..."

So the reason why it works with IQueryable<> is something else. The method defined by the direct interface type is preferred over the method defined in the base interface.

To illustrate, change the above code to this:

interface IBase
{
    void Overloaded(Expression<Func<int, int>> e);
}
interface IDerived : IBase
{
    void Overloaded(Func<int, int> d);
}
class C : IDerived
{
    public void Overloaded(Expression<Func<int, int>> e)
    {
        Console.WriteLine("expression tree");
    }
    public void Overloaded(Func<int, int> d)
    {
        Console.WriteLine("delegate");
    }
}

Then:

IDerived x = new C();
x.Overloaded(i => i + 1);  // compiles! At runtime, writes "delegate" to the console

As you see, the member defined in IDerived is chosen, not the one defined in IBase. Note that I reversed the situation (compared to IQueryable<>) so in my example the delegate overload is defined in the most derived interface and is therefore preferred over the expression tree overload.

Note: In the IQueryable<> case the OrderBy methods in question are not ordinary instance methods. Instead, one is an extension method to the derived interface, and the other is an extension method to the base interface. But the explanation is similar.

Lytta answered 2/9, 2012 at 15:9 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.