It's great to ask a question when you don't understand something, but the problem is that it can be hard to know which bit someone doesn't understand. I hope I help here, rather than tell you a bunch of stuff you know, and not actually answer your question.
Let's go back to the days before Linq, before expressions, before lambda, and before even anonymous delegates.
In .NET 1.0 we didn't have any of those. We didn't even have generics. We did though have delegates. And a delegate is related to a function pointer (if you know C, C++ or languages with such) or function as argument/variable (if you know Javascript or languages with such).
We could define a delegate:
public delegate int MyDelegate(double someValue, double someOtherValue);
And then use it as a type for a field, property, variable, method argument or as the basis of an event.
But at the time the only way to actually give a value for a delegate was to refer to an actual method.
public int CompareDoubles(double x, double y)
{
if (x < y) return -1;
return y < x ? 1 : 0;
}
MyDelegate dele = CompareDoubles;
We can invoke that with dele.Invoke(1.0, 2.0)
or the shorthand dele(1.0, 2.0)
.
Now, because we have overloading in .NET, we can have more than one thing that CompareDoubles
refers to. That isn't a problem, because if we also had e.g. public int CompareDoubles(double x, double y, double z){…}
the compiler could know that you could only possibly have meant to assign the other CompareDoubles
to dele
so it's unambiguous. Still, while in the context CompareDoubles
means a method that takes two double
arguments and returns an int
, outside of that context CompareDoubles
means the group of all the methods with that name.
Hence, Method Group which is what we call that.
Now, with .NET 2.0 we got generics, which is useful with delegates, and at the same time in C#2 we got anonymous methods, which is also useful. As of 2.0 we could now do:
MyDelegate dele = delegate (double x, double y)
{
if (x < y) return -1;
return y < x ? 1 : 0;
};
This part was just syntactic sugar from C#2, and behind the scenes there's still a method there, though it has an "unspeakable name" (a name that is valid as a .NET name but not valid as a C# name, so C# names can't clash with it). It was handy if, as was often the case, one was creating methods just to have them used once with a particular delegate though.
Move forward a bit further, and at .NET 3.5 have covariance and contravariance (great with delegates) the Func
and Action
delegates (great for reusing the same name based on type, rather than having a bunch of different delegates which were often very similar) and along with it came C#3 which had lambda expressions.
Now, these are a bit like anonymous methods in one use, but not in another.
That's why we can't do:
var func = (int i) => i * 2;
var
works out what it means from what's been assigned to it, but lamdas work out what they are from what they've been assigned to, so this is ambiguous.
It could mean:
Func<int, int> func = i => i * 2;
In which case it's shorthand for:
Func<int, int> func = delegate(int i){return i * 2;};
Which in turn is shorthand something like for:
int <>SomeNameImpossibleInC# (int i)
{
return i * 2;
}
Func<int, int> func = <>SomeNameImpossibleInC#;
But it can also be used as:
Expression<Func<int, int>> func = i => i * 2;
Which is shorthand for:
Expression<Func<int, int>> func = Expression.Lambda<Func<int, int>>(
Expression.Multiply(
param,
Expression.Constant(2)
),
param
);
And we also with .NET 3.5 have Linq which makes heavy use of both of these. Indeed, Expressions is considered part of Linq and is in the System.Linq.Expressions
namespace. Note that the object we get here is a description of what we want done (take the parameter, multiply it by two, give us the result) not of how to do it.
Now, Linq operates in two main ways. On IQueryable
and IQueryable<T>
and on IEnumerable
and IEnumerable<T>
. The former defines operations to be used on "a provider" with just what "a provider does" being up to that provider, and the latter defines the same operations on in-memory sequences of values.
We can move from one to the other. We can turn an IEnumerable<T>
into an IQueryable<T>
with AsQueryable
which will give us a wrapper on that enumerable, and we can turn the IQueryable<T>
into an IEnumerable<T>
just by treating it as one, because IQueryable<T>
derives from IEnumerable<T>
.
The enumerable form uses the delegates. A simplified version of how Select
works (there are many optimisations this version leaves out, and I'm skipping error checking and in indirection to ensure that error checking happens immediately) would be:
public static IEnumerable<TResult> Select(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
foreach(TSource item in source) yield return selector(item);
}
The queryable version on the other hand works by taking the expression tree from the Expression<TSource, TResult>
making it part of an expression that includes the call to Select
, and the source queryable, and returns an object wrapping that expression. So in other words a call to queryable's Select
returns an object that represents a call to queryable's Select
!
Just what is done with that depends on the provider. Database providers turn them into SQL, enumerables call Compile()
on the expression to create a delegate and then we're back at the first version of Select
above, and so on.
But that history considered, let's go backwards through the history again. A lambda can represent either an expression or a delegate (and if an expression, we can Compile()
it to get the same delegate). A delegate is a way of pointing to a method through a variable, and a method is part of a method group. All of this is built on technology which in the first version could only be called by creating a method and then passing that.
Now, lets say we have a method that takes a single argument and has a result.
public string IntString(int num) { return num.ToString(); }
Now lets say we referenced it in a lambda selector:
Enumerable.Range(0, 10).Select(i => IntString(i));
We have a lambda creating an anonymous method for a delegate, and that anonymous method in turn calls a method with the same argument and return types. In a way that's a bit like if we had:
public string MyAnonymousMethod(int i){return IntString(i);}
MyAnonymousMethod
is a bit pointless here; all it does is call IntString(i)
and return the result, so why not just call IntString
in the first place and cut out going through that method:
Enumerable.Range(0, 10).Select(IntString);
We've cut out a needless (though see note below about delegate caching) level of indirection by taking the lambda-based delegate and converting it to a method group. Hence ReSharper's advice "Convert to Method Group" or however it's worded (I don't use ReSharper myself).
There is though something to be careful of here. IQueryable<T>
's Select only takes expressions, so the provider can try to work out how to convert it to its way of doing stuff (e.g. SQL against a database). IEnumerable<T>
's Select only takes delegates so they can be executed in the .NET application itself. We can go from the former to the latter (when the queryable is really a wrapped enumerable) with Compile()
, but we can't go from the latter to the former: We don't have a way of taking a delegate and turning it into an expression that means anything other than "call this delegate" which isn't something that can be turned into SQL.
Now when we use a lambda expression like i => i * 2
it will be an expression when used with IQueryable<T>
and a delegate when used with IEnumerable<T>
due to overload resolution rules favouring the expression with queryable (as a type it can handle both, but the expression form works with the most derived type). If though we explicitly give it a delegate, whether because we typed it somewhere as Func<>
or it comes from a method group, then the overloads taking expressions aren't available and those taking delegates are used. This means it doesn't get passed to the database but rather the linq expression up to that point becomes the "database part" and it gets called and the rest of the work done in memory.
95% of the time that's best avoided. So 95% of the time if you get advice of "convert to method group" with a database-backed query you should think "uh oh! that's actually a delegate. Why is that a delegate? Can I change it to be an expression?". Only the remaining 5% of the time should you think "that'll be slightly shorter if I just pass in the method name". (Also, using a method group instead of a delegate prevents caching of delegates the compiler can do otherwise, so it might be less efficient).
There, I hope I covered the bit that you didn't understand in the course of all that, or at least there's a bit here you can point to and say "that bit there, that's the bit I don't grok".
And it worked! But we do not understand why.
I wish more developers had this attitude. Even the basic fact that you understand that you do not understand puts you in the top 10%. – InvolveusefulExpression.Invoke
this should not compile. Are you missing some piece of code?Expression<Func<SomeDataType, bool>>
does not have anInvoke
member. – Involve