How do I create an expression tree calling IEnumerable<TSource>.Any(...)?

I am trying to create an expression tree that represents the following:

myObject.childObjectCollection.Any(i => i.Name == "name");

Shortened for clarity, I have the following:

//'myObject.childObjectCollection' is represented here by 'propertyExp'
//'i => i.Name == "name"' is represented here by 'predicateExp'
//but I am struggling with the Any() method reference: if I make the parent method
//non-generic, Expression.Call() fails, but if I use <T> as below, the
//MethodInfo object is always null and I can't get a reference to it

private static MethodCallExpression GetAnyExpression<T>(MemberExpression propertyExp, Expression predicateExp)
{
    MethodInfo method = typeof(Enumerable).GetMethod("Any", new[]{ typeof(Func<IEnumerable<T>, Boolean>)});
    return Expression.Call(propertyExp, method, predicateExp);
}

What am I doing wrong? Anyone have any suggestions?

Courlan answered 28/11, 2008 at 17:55 Comment(1)
Core of the problem: #270078 - Hardan

There are several things wrong with how you're going about it.

  1. You're mixing abstraction levels. The T parameter to GetAnyExpression<T> could differ from the type argument used to instantiate propertyExp.Type. The T type parameter is one step closer to compile time in the abstraction stack: unless you're calling GetAnyExpression<T> via reflection, it is determined at compile time, whereas the type embedded in the expression passed as propertyExp is determined at runtime. Passing the predicate as an Expression is also an abstraction mix-up, which is the next point.

  2. The predicate you are passing to GetAnyExpression should be a delegate value, not an Expression of any kind, since you're trying to call Enumerable.Any<T>. If you were trying to call an expression-tree version of Any, then you ought to pass a LambdaExpression instead, which you would be quoting. That is one of the rare cases where you might be justified in passing a more specific type than Expression, which leads me to my next point.

  3. In general, you should pass around Expression values. When working with expression trees (and this applies across all kinds of compilers, not just LINQ and its friends) you should do so in a way that's agnostic as to the immediate composition of the node tree you're working with. You are presuming that you're calling Any on a MemberExpression, but you don't actually need to know that you're dealing with a MemberExpression, just an Expression whose type is some instantiation of IEnumerable<>. This is a common mistake for people not familiar with the basics of compiler ASTs. Frans Bouma repeatedly made the same mistake when he first started working with expression trees: thinking in special cases. Think generally. You'll save yourself a lot of hassle in the medium and longer term.

  4. And here comes the meat of your problem (though the second and probably first issues would have bitten you if you had gotten past it): you need to find the appropriate generic overload of the Any method, and then instantiate it with the correct type. Reflection doesn't provide you with an easy out here; you need to iterate through the candidates and find an appropriate version.

So, breaking it down: you need to find a generic method (Any). Here's a utility function that does that:

static MethodBase GetGenericMethod(Type type, string name, Type[] typeArgs, 
    Type[] argTypes, BindingFlags flags)
{
    int typeArity = typeArgs.Length;
    var methods = type.GetMethods()
        .Where(m => m.Name == name)
        .Where(m => m.GetGenericArguments().Length == typeArity)
        .Select(m => m.MakeGenericMethod(typeArgs));

    return Type.DefaultBinder.SelectMethod(flags, methods.ToArray(), argTypes, null);
}
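
For the single case of Any, a narrower lookup is also possible. The following is only a sketch, not part of the answer above: the class and method names are hypothetical, and it assumes that Enumerable exposes exactly one public Any overload taking two parameters (the source sequence and the predicate).

```csharp
using System;
using System.Linq;
using System.Reflection;

static class AnyLookupSketch
{
    // Hypothetical helper: finds the open generic method
    // Enumerable.Any<TSource>(IEnumerable<TSource>, Func<TSource, bool>)
    // and closes it over the given element type.
    public static MethodInfo FindAnyWithPredicate(Type elementType)
    {
        MethodInfo openAny = typeof(Enumerable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            // Assumption: only one Any overload has two parameters;
            // Single() throws if that ever stops being true.
            .Single(m => m.Name == "Any"
                      && m.IsGenericMethodDefinition
                      && m.GetParameters().Length == 2);

        // Instantiate Any<TSource> with the concrete element type.
        return openAny.MakeGenericMethod(elementType);
    }
}
```

Calling AnyLookupSketch.FindAnyWithPredicate(typeof(string)) should give a MethodInfo whose second parameter is Func<string, bool>. The general-purpose GetGenericMethod above is the better fit when the method name and arity are not known in advance.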

However, it requires the type arguments and the correct argument types. Getting that from your propertyExp Expression isn't entirely trivial, because the Expression may be of a List<T> type, or some other type, but we need to find the IEnumerable<T> instantiation and get its type argument. I've encapsulated that into a couple of functions:

static bool IsIEnumerable(Type type)
{
    return type.IsGenericType
        && type.GetGenericTypeDefinition() == typeof(IEnumerable<>);
}

static Type GetIEnumerableImpl(Type type)
{
    // Get IEnumerable implementation. Either type is IEnumerable<T> for some T, 
    // or it implements IEnumerable<T> for some T. We need to find the interface.
    if (IsIEnumerable(type))
        return type;
    Type[] t = type.FindInterfaces((m, o) => IsIEnumerable(m), null);
    Debug.Assert(t.Length == 1);
    return t[0];
}

So, given any Type, we can now pull the IEnumerable<T> instantiation out of it - and assert if there isn't (exactly) one.
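
As a rough illustration of the same idea, here is a standalone sketch that goes straight to the element type. The names are hypothetical, and it uses GetInterfaces() with Single() in place of FindInterfaces() and Debug.Assert(), so a type with zero or several IEnumerable<T> implementations throws instead of asserting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ElementTypeSketch
{
    // Hypothetical helper (not from the answer): returns T for a type that is,
    // or implements exactly one, IEnumerable<T>.
    public static Type GetElementType(Type type)
    {
        bool IsGenericEnumerable(Type t) =>
            t.IsGenericType && t.GetGenericTypeDefinition() == typeof(IEnumerable<>);

        Type enumerable = IsGenericEnumerable(type)
            ? type
            // Single() throws unless exactly one interface matches.
            : type.GetInterfaces().Single(IsGenericEnumerable);

        return enumerable.GetGenericArguments()[0];
    }
}
```

Under these assumptions, List<string> yields string and Dictionary<string, int> yields KeyValuePair<string, int>.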

With that work out of the way, solving the real problem isn't too difficult. I've renamed your method to CallAny, and changed the parameter types as suggested:

static Expression CallAny(Expression collection, Delegate predicate)
{
    Type cType = GetIEnumerableImpl(collection.Type);
    collection = Expression.Convert(collection, cType);

    Type elemType = cType.GetGenericArguments()[0];
    Type predType = typeof(Func<,>).MakeGenericType(elemType, typeof(bool));

    // Enumerable.Any<T>(IEnumerable<T>, Func<T,bool>)
    MethodInfo anyMethod = (MethodInfo)
        GetGenericMethod(typeof(Enumerable), "Any", new[] { elemType }, 
            new[] { cType, predType }, BindingFlags.Static);

    return Expression.Call(
        anyMethod,
        collection,
        Expression.Constant(predicate));
}

Here's a Main() routine which uses all the above code and verifies that it works for a trivial case:

static void Main()
{
    // sample
    List<string> strings = new List<string> { "foo", "bar", "baz" };

    // Trivial predicate: x => x.StartsWith("b")
    ParameterExpression p = Expression.Parameter(typeof(string), "item");
    Delegate predicate = Expression.Lambda(
        Expression.Call(
            p,
            typeof(string).GetMethod("StartsWith", new[] { typeof(string) }),
            Expression.Constant("b")),
        p).Compile();

    Expression anyCall = CallAny(
        Expression.Constant(strings),
        predicate);

    // now test it.
    Func<bool> a = (Func<bool>) Expression.Lambda(anyCall).Compile();
    Console.WriteLine("Found? {0}", a());
    Console.ReadLine();
}
Overman answered 28/11, 2008 at 19:32 Comment(1)
Barry - I really appreciate you taking the time to explain all of that to me, many thanks, I will give it a crack over the weekend :) - Courlan

Barry's answer provides a working solution to the question posed by the original poster. Thanks to both of those individuals for asking and answering.

I found this thread as I was trying to devise a solution to a quite similar problem: programmatically creating an expression tree that includes a call to the Any() method. As an additional constraint, however, the ultimate goal of my solution was to pass such a dynamically created expression through Linq-to-SQL (more precisely LINQ to Entities, since my examples use Entity Framework) so that the work of the Any() evaluation is actually performed in the DB itself.

Unfortunately, the solution as discussed so far is not something that Linq-to-SQL can handle.

Operating under the assumption that this might be a pretty popular reason for wanting to build a dynamic expression tree, I decided to augment the thread with my findings.

When I attempted to use the result of Barry's CallAny() as an expression in a Linq-to-SQL Where() clause, I received an InvalidOperationException with the following properties:

  • HResult=-2146233079
  • Message="Internal .NET Framework Data Provider error 1025"
  • Source=System.Data.Entity

After comparing a hard-coded expression tree to the dynamically created one using CallAny(), I found that the core problem was the Compile() of the predicate expression: CallAny() then embeds the resulting delegate in the tree as a constant. Without digging deep into Linq-to-SQL implementation details, it seemed reasonable to me that Linq-to-SQL wouldn't know what to do with such a structure.

Therefore, after some experimentation, I was able to achieve my desired goal by slightly revising the suggested CallAny() implementation to take a predicateExpression rather than a delegate for the Any() predicate logic.

My revised method is:

static Expression CallAny(Expression collection, Expression predicateExpression)
{
    Type cType = GetIEnumerableImpl(collection.Type);
    collection = Expression.Convert(collection, cType); // (see "NOTE" below)

    Type elemType = cType.GetGenericArguments()[0];
    Type predType = typeof(Func<,>).MakeGenericType(elemType, typeof(bool));

    // Enumerable.Any<T>(IEnumerable<T>, Func<T,bool>)
    MethodInfo anyMethod = (MethodInfo)
        GetGenericMethod(typeof(Enumerable), "Any", new[] { elemType }, 
            new[] { cType, predType }, BindingFlags.Static);

    return Expression.Call(
        anyMethod,
        collection,
        predicateExpression);
}
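
The structural difference can be observed without a database. The following is a minimal in-memory sketch, with hypothetical names and a string list standing in for the collection: the predicate stays a LambdaExpression and appears in the Any() call as a Lambda node, not as a constant holding a compiled delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

static class LambdaPredicateSketch
{
    // Builds () => source.Any(item => item.StartsWith(prefix)) as a tree.
    public static Expression<Func<bool>> BuildAnyStartsWith(
        List<string> source, string prefix)
    {
        ParameterExpression item = Expression.Parameter(typeof(string), "item");

        // item => item.StartsWith(prefix), kept as an expression (never compiled here)
        LambdaExpression predicate = Expression.Lambda<Func<string, bool>>(
            Expression.Call(
                item,
                typeof(string).GetMethod("StartsWith", new[] { typeof(string) }),
                Expression.Constant(prefix)),
            item);

        // Assumption: exactly one Enumerable.Any overload takes two parameters.
        MethodInfo any = typeof(Enumerable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Single(m => m.Name == "Any" && m.GetParameters().Length == 2)
            .MakeGenericMethod(typeof(string));

        Expression collection = Expression.Convert(
            Expression.Constant(source), typeof(IEnumerable<string>));

        return Expression.Lambda<Func<bool>>(
            Expression.Call(any, collection, predicate));
    }
}
```

Compiling and invoking the result still works in memory, and inspecting ((MethodCallExpression)tree.Body).Arguments[1].NodeType shows ExpressionType.Lambda, which is the shape a query provider can walk.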

Now I will demonstrate its usage with EF. For clarity I should first show the toy domain model & EF context I am using. Basically my model is a simplistic Blogs & Posts domain ... where a blog has multiple posts and each post has a date:

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }

    public virtual List<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public DateTime Date { get; set; }

    public int BlogId { get; set; }
    public virtual Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
}

With that domain established, here is my code to ultimately exercise the revised CallAny() and make Linq-to-SQL do the work of evaluating the Any(). My particular example will focus on returning all Blogs which have at least one Post that is newer than a specified cutoff date.

static void Main()
{
    Database.SetInitializer<BloggingContext>(
        new DropCreateDatabaseAlways<BloggingContext>());

    using (var ctx = new BloggingContext())
    {
        // insert some data
        var blog  = new Blog(){Name = "blog"};
        // one list holding all three posts (separate assignments to
        // blog.Posts would overwrite each other and keep only the last)
        blog.Posts = new List<Post>()
        {
            new Post() { Title = "p1", Date = DateTime.Parse("01/01/2001") },
            new Post() { Title = "p2", Date = DateTime.Parse("01/01/2002") },
            new Post() { Title = "p3", Date = DateTime.Parse("01/01/2003") }
        };
        ctx.Blogs.Add(blog);

        blog = new Blog() { Name = "blog 2" };
        blog.Posts = new List<Post>()
            { new Post() { Title = "p1", Date = DateTime.Parse("01/01/2001") } };
        ctx.Blogs.Add(blog);
        ctx.SaveChanges();


        // first, do a hard-coded Where() with Any(), to demonstrate that
        // Linq-to-SQL can handle it
        // constructed directly so the value does not depend on the current culture
        var cutoffDateTime = new DateTime(2001, 12, 31);
        var hardCodedResult = 
            ctx.Blogs.Where((b) => b.Posts.Any((p) => p.Date > cutoffDateTime));
        var hardCodedResultCount = hardCodedResult.ToList().Count;
        Debug.Assert(hardCodedResultCount > 0);


        // now do a logically equivalent Where() with Any(), but programmatically
        // build the expression tree
        var blogsWithRecentPostsExpression = 
            BuildExpressionForBlogsWithRecentPosts(cutoffDateTime);
        var dynamicExpressionResult = 
            ctx.Blogs.Where(blogsWithRecentPostsExpression);
        var dynamicExpressionResultCount = dynamicExpressionResult.ToList().Count;
        Debug.Assert(dynamicExpressionResultCount > 0);
        Debug.Assert(dynamicExpressionResultCount == hardCodedResultCount);
    }
}

Where BuildExpressionForBlogsWithRecentPosts() is a helper function that uses CallAny() as follows:

private static Expression<Func<Blog, Boolean>> BuildExpressionForBlogsWithRecentPosts(
    DateTime cutoffDateTime)
{
    var blogParam = Expression.Parameter(typeof(Blog), "b");
    var postParam = Expression.Parameter(typeof(Post), "p");

    // (p) => p.Date > cutoffDateTime
    var left = Expression.Property(postParam, "Date");
    var right = Expression.Constant(cutoffDateTime);
    var dateGreaterThanCutoffExpression = Expression.GreaterThan(left, right);
    var lambdaForTheAnyCallPredicate = 
        Expression.Lambda<Func<Post, Boolean>>(dateGreaterThanCutoffExpression, 
            postParam);

    // (b) => b.Posts.Any((p) => p.Date > cutoffDateTime)
    var collectionProperty = Expression.Property(blogParam, "Posts");
    var resultExpression = CallAny(collectionProperty, lambdaForTheAnyCallPredicate);
    return Expression.Lambda<Func<Blog, Boolean>>(resultExpression, blogParam);
}

NOTE: I found one other, seemingly unimportant, difference between the hard-coded and dynamically built expressions. The dynamically built one has an "extra" Convert call that the hard-coded version doesn't seem to have (or need?). The conversion is introduced in the CallAny() implementation. Linq-to-SQL seems to be OK with it, so I left it in place even though it was unnecessary for my sample. I am not certain whether the conversion might be needed in more robust usages than my toy sample.
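
For what it is worth, a small in-memory sketch suggests the Convert is optional when the collection's static type is a reference type that already implements IEnumerable<T>. The names below are hypothetical, the list is assumed to be non-null, and this says nothing about how any particular query provider treats either form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

static class ConvertFreeAnySketch
{
    // Builds numbers.Any(n => n < 0) without wrapping the collection in Convert.
    public static MethodCallExpression AnyNegative(List<int> numbers)
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        Expression<Func<int, bool>> isNegative = Expression.Lambda<Func<int, bool>>(
            Expression.LessThan(n, Expression.Constant(0)), n);

        // Assumption: exactly one Enumerable.Any overload takes two parameters.
        MethodInfo any = typeof(Enumerable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Single(m => m.Name == "Any" && m.GetParameters().Length == 2)
            .MakeGenericMethod(typeof(int));

        // The constant has type List<int>, which is reference-assignable to the
        // IEnumerable<int> parameter, so Expression.Call accepts it directly.
        return Expression.Call(any, Expression.Constant(numbers), isNegative);
    }
}
```

Here the first argument of the call remains a Constant node, with no Convert node in between.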

Scalage answered 8/8, 2013 at 14:54 Comment(1)
I could have told you that one - it's item (1) in my list of things to do, mixing abstraction levels. The predicate is a runtime value, but Expression is a syntax tree value. Reading the answer after a period of 8 years, I would have factored out the conversion of delegate to Expression in a separate method, or at the main call site. Approximate levels of abstraction being original source -> generic methods -> polymorphic values -> expression-typed values, but the sequence repeats itself inside the language represented by the expression-typed values, turtles all the way down. - Overman
