Why don't non-capturing expression trees that are initialized using lambda expressions get cached?
Consider the following class:

class Program
{
    static void Test()
    {
        TestDelegate<string, int>(s => s.Length);

        TestExpressionTree<string, int>(s => s.Length);
    }

    static void TestDelegate<T1, T2>(Func<T1, T2> del) { /*...*/ }

    static void TestExpressionTree<T1, T2>(Expression<Func<T1, T2>> exp) { /*...*/ }
}

This is what the compiler generates (the actual output is slightly less readable):

class Program
{
    static void Test()
    {
        // The delegate call:
        TestDelegate(Cache.Func ?? (Cache.Func = Cache.Instance.FuncImpl));

        // The expression call:
        var paramExp = Expression.Parameter(typeof(string), "s");
        var propExp = Expression.Property(paramExp, "Length");
        var lambdaExp = Expression.Lambda<Func<string, int>>(propExp, paramExp);
        TestExpressionTree(lambdaExp);
    }

    static void TestDelegate<T1, T2>(Func<T1, T2> del) { /*...*/ }

    static void TestExpressionTree<T1, T2>(Expression<Func<T1, T2>> exp) { /*...*/ }

    sealed class Cache
    {
        public static readonly Cache Instance = new Cache();

        public static Func<string, int> Func;

        internal int FuncImpl(string s) => s.Length;
    }
}

This way, the delegate passed in the first call is initialized once and reused across multiple Test calls.

However, the expression tree passed in the second call is not reused: a new expression tree is built on each Test call.

Given that it doesn't capture anything and that expression trees are immutable, what would be the problem with caching the expression tree as well?
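The difference is directly observable. Here is a minimal sketch; note that the delegate caching is an undocumented implementation detail of the compiler, so the first comparison reflects current Roslyn behaviour rather than a language guarantee:

```csharp
using System;
using System.Linq.Expressions;

class CachingDemo
{
    // Both methods return the same non-capturing lambda body.
    public static Func<string, int> GetDelegate() => s => s.Length;
    public static Expression<Func<string, int>> GetTree() => s => s.Length;

    static void Main()
    {
        // The non-capturing delegate is cached by the compiler, so the
        // same instance comes back on every call (current Roslyn behaviour).
        Console.WriteLine(ReferenceEquals(GetDelegate(), GetDelegate()));

        // The expression tree is rebuilt on every call, so the two
        // instances are always distinct.
        Console.WriteLine(ReferenceEquals(GetTree(), GetTree()));
    }
}
```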

Edit

I think I need to clarify why I think the expression trees are fit to be cached.

  1. The resulting expression tree is known at the compilation time (well, it is created by the compiler).
  2. They are immutable. So, unlike the array example given by X39 below, an expression tree can't be modified after it's initialized and is therefore safe to cache.
  3. There can be only so many expression trees in a code-base - Again, I'm talking about the ones that can be cached, i.e. the ones that are initialized using lambda expressions (not the ones that are created manually) without capturing any outside state/variable. Auto-interning of string literals would be a similar example.
  4. They are meant to be traversed - They can be compiled to create a delegate, but that's not their main function. If someone wants a compiled delegate, they can just accept one (a Func<T>, instead of an Expression<Func<T>>). Accepting an expression tree indicates that it's going to be used as a data structure. So, "they should be compiled first" is not a sensible argument against caching of expression trees.
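Point 4 can be illustrated with a short sketch: walking the tree as data with an ExpressionVisitor, without ever compiling it (MemberNameCollector is a hypothetical helper name, not part of the framework):

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical visitor that records the last member access it sees,
// demonstrating that the tree is consumed as a data structure.
class MemberNameCollector : ExpressionVisitor
{
    public string LastMember { get; private set; }

    protected override Expression VisitMember(MemberExpression node)
    {
        LastMember = node.Member.Name; // e.g. "Length" for s.Length
        return base.VisitMember(node);
    }
}

class VisitorDemo
{
    static void Main()
    {
        Expression<Func<string, int>> exp = s => s.Length;
        var collector = new MemberNameCollector();
        collector.Visit(exp);
        Console.WriteLine(collector.LastMember); // prints "Length"
    }
}
```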

What I'm asking about is the potential drawbacks of caching these expression trees. The memory requirements mentioned by svick are a more likely example.

Boult answered 13/10, 2018 at 12:48 Comment(3)
Caching also has its costs (increased memory usage). And since the code that handles the expression tree is likely going to be more expensive (in terms of both time and allocations) than the code generated by the compiler here, it's not clear to me this optimization would actually be an improvement.Coffeecolored
Expressions are not only used for method execution. An expression must be compiled first and then can be run as a function. Assume that an expression is just another instance of some class; then it depends on what you do with it whether to cache it or not.Charis
Only C# language designers can answer the why part. I agree with your points and would also like them to be cached. But I guess the answer is much more trivial and non-technical - like the beginning of this answer by Eric Lippert and the associated blog post (too bad he's not working for MS anymore). In short, the typical balance between time, resources, costs, priorities and benefits for implementing a feature :)Apoplexy

Why don't non-capturing expression trees that are initialized using lambda expressions get cached?

I wrote that code in the compiler, both in the original C# 3 implementation and the Roslyn rewrite.

As I always say when asked a "why not" question: compiler writers are not required to provide a reason why they did not do something. Doing something takes work, requires effort, and costs money. The default position therefore is always to not do something when work is unnecessary.

Rather, the person who wants the work done needs to justify why that work is worth the cost. And actually, the requirement is stronger than that. The person who wants the work done is required to justify why the unnecessary work is a better way to spend time, effort and money than any other possible use of the developer's time. There are literally an infinite number of ways to improve the compiler's performance, feature set, robustness, usability, and so on. What makes this one so great?

Now, whenever I give this explanation, I get pushback saying "Microsoft is rich, blah blah blah". Having a lot of resources is not the same as having infinite resources, and the compiler is already extremely expensive. I also get pushback saying "open source makes labour free", which it absolutely does not.

I noted that time was a factor. It may be helpful to expand on that further.

When C# 3.0 was being developed, Visual Studio had a specific date upon which it would be "released to manufacturing", a quaint term from the time when software was distributed mostly on CD-ROMs that could not be changed once they were printed. This date was not arbitrary; rather, there was a whole chain of dependencies that followed it. If, say, SQL Server had a feature that depended on LINQ, it would not make any sense to delay the VS release until after that year's SQL Server release, and so the VS schedule affected the SQL Server schedule, which in turn affected other teams' schedules, and so on.

Therefore, every team in the VS organization submitted a schedule, and the team with the most days of work on that schedule was the "long pole". The C# team was the long pole for VS, and I was the long pole for the C# compiler team, so every day that I was late in delivering my compiler features was a day that Visual Studio, and every downstream product, would slip its schedule and disappoint its customers.

This is a powerful disincentive to doing unnecessary performance work, particularly performance work that might make things worse, not better. A cache without an expiry policy has a name: it is a memory leak.

As you note, anonymous functions are cached. When I implemented lambdas, I used the same infrastructure code as anonymous functions, so that cache was (1) "sunk cost" -- the work was already done and it would have been more work to turn it off than leave it on, and (2) had already been tested and vetted by my predecessors.

I considered implementing a similar cache on expression trees, using the same logic, but realized that this would (1) be work, which requires time, which I was already short on, and (2) I had no idea what the performance impacts would be of caching such an object. Delegates are really small. Delegates are a single object; if the delegate is logically static, which the ones that C# caches aggressively are, it doesn't even contain a reference to the receiver. Expression trees, by contrast, are potentially huge trees. They're a graph of small objects, but that graph is potentially large. Graphs of objects make more work for the garbage collector the longer they live!

Therefore, whatever performance tests and metrics had been used to justify the decision to cache delegates would not be applicable to expression trees, since the memory burdens were completely different. I did not want to create a new source of memory leaks in our most important new language feature. The risk was too high.

But a risk might be worth it if the benefit is large. So what's the benefit? Start by asking yourself "where are expression trees used?" In LINQ queries that are going to be remoted to databases. This is a super expensive operation in both time and memory. Adding a cache doesn't get you a big win because the work you're about to do is millions of times more expensive than the win; the win is noise.

Compare that to the performance win for delegates. The difference between "allocate x => x + 1, then call it" a million times and "check the cache, if it is not cached allocate it, call it" is trading an allocation for a check, which could save you entire nanoseconds. That seems like no big deal, but the call will also take nanoseconds, so on a percentage basis, it's significant. Caching delegates is a clear win. Caching expression trees is not anywhere close to a clear win; we'd need data that it is a benefit that justifies the risk.

Therefore it was an easy decision to make to not spend any time on this unnecessary, likely unnoticeable, unimportant optimization in C# 3.

During C# 4, we had many more important things to do than revisit this decision.

After C# 4, the team divided into two sub-teams, one to rewrite the compiler, "Roslyn", and the other to implement async-await in the original compiler codebase. The async-await team was entirely consumed by implementing that complex and difficult feature, and of course the team was smaller than usual. And they knew that all their work was eventually going to be replicated in Roslyn and then thrown away; that compiler was at the end of its life. So there was no incentive to spend time or effort adding optimizations.

The proposed optimization was on my list of things to consider when I rewrote the code in Roslyn, but our highest priority was getting the compiler working end-to-end before we optimized small parts of it, and I left Microsoft in 2012, before that work was finished.

As for why none of my coworkers revisited this issue after I left, you'd have to ask them, but I am pretty sure they were very busy doing real work on real features that were requested by real customers, or on performance optimizations that had bigger wins for smaller cost. That work included open-sourcing the compiler, which is not cheap.

So, if you want this work done, you have some choices.

  • The compiler is open sourced; you could do it yourself. If that sounds like a lot of work for very little benefit to you, then you now have a more intuitive understanding of why no one has done this work since the feature was implemented in 2005.

Of course, this is still not "free" to the compiler team. Someone would have to spend time and effort and money reviewing your work. Remember, most of the cost of a performance optimization is not the five minutes it takes to change the code. It's the weeks of testing under a sample of all possible real-world conditions that demonstrate that the optimization works and does not make things worse! Performance work is the most expensive work I do.

  • The design process is open. Enter an issue and in that issue, give a compelling reason why you think this enhancement is worth it. With data.

So far all you've said is why it is possible. Possible doesn't cut it! Lots of things are possible. Give us numbers that justify why the compiler devs ought to be spending their time on making this enhancement rather than implementing new features requested by customers.

The actual win in avoiding repeated allocations of complex expression trees is avoiding collection pressure, and this is a serious concern. Many features in C# are designed to avoid collection pressure, and expression trees are NOT one of them. My advice to you if you want this optimization is to concentrate on its impact on pressure, because that's where you'll find the biggest win and be able to make the most compelling argument.
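The pressure argument can be made concrete with a rough measurement. This is a sketch, not a benchmark; it assumes .NET Core 3.0 or later for GC.GetAllocatedBytesForCurrentThread, and the absolute numbers will vary by runtime:

```csharp
using System;
using System.Linq.Expressions;

class PressureSketch
{
    // Each call builds a fresh tree: a ParameterExpression, a
    // MemberExpression and a LambdaExpression at minimum.
    public static Expression<Func<string, int>> BuildTree() => s => s.Length;

    static void Main()
    {
        const int iterations = 100_000;

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            var tree = BuildTree(); // allocates new nodes every iteration
            GC.KeepAlive(tree);
        }
        long rebuilt = GC.GetAllocatedBytesForCurrentThread() - before;

        var cached = BuildTree();
        before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            GC.KeepAlive(cached); // reusing one instance allocates nothing
        }
        long reused = GC.GetAllocatedBytesForCurrentThread() - before;

        Console.WriteLine($"rebuilt: {rebuilt:N0} bytes, reused: {reused:N0} bytes");
    }
}
```

The trade-off discussed in the comments is exactly this: the cached version allocates less per call, but the cached tree becomes a long-lived object graph that every collection must walk.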

Shonda answered 2/11, 2018 at 13:24 Comment(7)
This also leaves out the very important point that it's pretty trivial for a programmer to perform the optimization themselves in a given actual program (by simply storing the expression in a field rather than writing it inline) if the developer is in a situation in which they're confident that it's a performance win to cache the expression. The compiler optimizations that are most valuable are ones that are either hard to implement, or not possible to implement from user code.Cockayne
@Servy: That is a great point that I considered making; the pushback on that point is that expression trees are commonly used in query comprehension expressions, and in those it is genuinely inconvenient for the developer to write a cache. You'd have to rewrite the query into its fluent form, and then pull the lambdas into individual declarations. And if those lambdas involve anonymous types, then there are type inference problems. So this is a case where the compiler can do a much better job than the developer.Shonda
@Servy: A point that is seldom mentioned but also worth considering is to go the other way. Many compilers do not give you fine-grained control for turning off optimizations that you do not want. If you discovered, say, that excessive delegate caching was causing a problem, there's no way to tell the C# compiler to cut it out. I've recently been looking at cpython's optimizer, and surprisingly there's no way to tell it to stop doing peephole optimizations, even though that screws up code coverage tools. And so on.Shonda
What is collection pressure, if I'm not being too dull?Felt
@BurakKarakuş: Garbage collection is expensive and disruptive, so we only want to do it when you're likely to collect a lot of garbage. The GC keeps track of "pressure", which is an abstraction over the idea of "how strongly does the GC feel like it should be collecting soon?" Allocating a lot of small objects increases pressure, because those objects are likely to be dead soon. Increasing pressure increases the likelihood that there will be a collection, which can kill performance, so you want to be mindful of pressure in high-performance applications.Shonda
@BurakKarakuş: The proposed optimization is particularly interesting with regards to pressure, because it decreases the number of allocations of small objects, which decreases pressure, but it increases the number of small, long-lived objects, which increases the cost of each collection. That's why it is totally not clear what the impact of this optimization would be on performance, because we are now being pulled in two different directions: fewer collections, but each is more expensive. It's not at all obvious that's a win!Shonda
While I'm sad that expression trees aren't cached, I concur that the gain-to-complexity ratio isn't clear. The only thing I'd like to note is that expression trees aren't used only for db access: a whole side of metaprogramming in C# was born thanks to expression trees (and especially before the introduction of the nameof). The possibility of passing forward the "name" of fields/properties without using strings was very very very powerful. Nowadays thanks to the nameof this is a little less important.Grande
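The inconvenience described in the comments above can be sketched. To cache by hand, a query comprehension has to be rewritten into fluent form so the lambda has somewhere to live; the names below are illustrative:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

class QueryCachingSketch
{
    // Hoisting the lambda into a static field works for the fluent form...
    public static readonly Expression<Func<int, bool>> IsEven = n => n % 2 == 0;

    public static IQueryable<int> Evens(IQueryable<int> source) => source.Where(IsEven);

    static void Main()
    {
        var data = new[] { 1, 2, 3, 4 }.AsQueryable();

        // ...but the equivalent query comprehension offers no place to
        // name the lambda, so its tree is rebuilt every time this runs:
        var evens = from n in data where n % 2 == 0 select n;

        Console.WriteLine(string.Join(",", Evens(data))); // 2,4
        Console.WriteLine(string.Join(",", evens));       // 2,4
    }
}
```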

The compiler does what it always does: it does not cache whatever you feed into it.

To see that this happens elsewhere too, look at what happens when you pass a new array to a method.

this.DoSomethingWithArray(new string[] { "foo", "bar" });

compiles to

IL_0001: ldarg.0
IL_0002: ldc.i4.2
IL_0003: newarr    [mscorlib]System.String
IL_0008: dup
IL_0009: ldc.i4.0
IL_000A: ldstr     "foo"
IL_000F: stelem.ref
IL_0010: dup
IL_0011: ldc.i4.1
IL_0012: ldstr     "bar"
IL_0017: stelem.ref
IL_0018: call      instance void Test::DoSomethingWithArray(string[])

instead of caching the array once and reusing it every time.
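The manual workaround for the array case is the same hoisting trick, with the caveat (noted in the comments below) that a shared array can be mutated by any caller:

```csharp
using System;

class ArrayHoisting
{
    // Hoisted once instead of allocated per call. Unlike an expression
    // tree, this array is mutable, so sharing it is only safe if no
    // caller writes to it.
    public static readonly string[] Words = { "foo", "bar" };

    public static int DoSomethingWithArray(string[] items) => items.Length;

    static void Main()
    {
        Console.WriteLine(DoSomethingWithArray(Words)); // prints 2
    }
}
```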

The same applies, more or less, to expressions; the compiler just does the handiwork of generating the tree for you. In the end, you are expected to know when caching is needed and apply it yourself.

To get a cached version, use something like this:

private static System.Linq.Expressions.Expression<Func<object, string>> Exp = (obj) => obj.ToString();
Duhl answered 17/10, 2018 at 10:35 Comment(2)
Arrays cannot be cached because they can be mutated, and then you'd be passing around a mutated array instead of the values you wanted. Expression trees can be cached because they are immutable.Shonda
Also, C# does cache arrays in a sense, if the array is made of constant integers. Try instead doing int[] x = {10, 20, 30}; and look at the generated code. C# generates the memory image of the array at compile time, and then creates a new array that copies that image, so that it does not have to spend time copying each individual element into the array, as in your example with strings.Shonda
