Automatically Compile Linq Queries
Asked Answered
W

4

36

We've found that compiling our Linq queries is much, much faster than them having to compile each time, so we would like to start using compiled queries. The problem is that it makes code harder to read, because the actual syntax of the query is off in some other file, away from where it's being used.

It occurred to me that it might be possible to write a method (or extension method) that uses reflection to determine what queries are being passed in and cache the compiled versions automatically for use in the future.

var foo = (from f in db.Foo where f.ix == bar select f).Cached();

Cached() would have to reflect the query object passed in and determine the table(s) selected on and the parameter types for the query. Obviously, reflection is a bit slow, so it might be better to use names for the cache object (but you'd still have to use reflection the first time to compile the query).

var foo = (from f in db.Foo where f.ix == bar select f).Cached("Foo.ix");

Does anyone have any experience with doing this, or know if it's even possible?

UPDATE: For those who have not seen it, you can compile LINQ queries to SQL with the following code:

public static class MyCompiledQueries
{
    public static Func<DataContext, int, IQueryable<Foo>> getFoo =
        CompiledQuery.Compile(
            (DataContext db, int ixFoo) => (from f in db.Foo
                                            where f.ix == ixFoo
                                            select f)
        );
}

What I am trying to do is have a cache of these Func<> objects that I can call into after automatically compiling the query the first time around.

Wean answered 3/8, 2009 at 18:7 Comment(5)
This is a confusing question because you seem to be conflating LINQ and LINQ to SQL (which additionally generates, compiles and caches execution plans behind the scenes every time a query's run). If you're asking about SQL Server's compiled execution plans, there's no way (that I'm aware of) to compile them and keep them cached other than running them.Famagusta
This has nothing to do with SQL Server. LINQ to SQL compiles queries -- which can take quite a while -- from both of the LINQ syntaxes (chaining or SQL-style) to SQL each time those queries are run. Read the link at the top to learn more.Wean
One problem I have found with using compiled queries with L2S in a web app is that to compile it you need to pass it the instance of the DataContext - for a web app this means that you need one shared DataContext for the entire site - which in return caused me some major multi threaded issues when the site started having a big load. I really really dislike how you have to pass the datacontext instance in when compiling the query...Saxon
You don't pass in a DataContext when compiling the query. See my answer below; you actually pass in a delegate that takes a DataContext (and other parameters) and returns an IQueryable<T>.Familial
Did you ever implement this path? If so, how did you deal with differing DataLoadOptions?Manana
F
18

You can't have extension methods invoked on anonymous lambda expressions, so you'll want to use a Cache class. In order to properly cache a query you'll also need to 'lift' any parameters (including your DataContext) into parameters for your lambda expression. This results in very verbose usage like:

var results = QueryCache.Cache((MyModelDataContext db) => 
    from x in db.Foo where !x.IsDisabled select x);

In order to clean that up, we can instantiate a QueryCache on a per-context basis if we make it non-static:

public class FooRepository
{
    readonly QueryCache<MyModelDataContext> q = 
        new QueryCache<MyModelDataContext>(new MyModelDataContext());
}

Then we can write a Cache method that will enable us to write the following:

var results = q.Cache(db => from x in db.Foo where !x.IsDisabled select x);

Any arguments in your query will also need to be lifted:

var results = q.Cache((db, bar) => 
    from x in db.Foo where x.id != bar select x, localBarValue);

Here's the QueryCache implementation I mocked up:

public class QueryCache<TContext> where TContext : DataContext
{
    private readonly TContext db;
    public QueryCache(TContext db)
    {
        this.db = db;
    }

    private static readonly Dictionary<string, Delegate> cache = new Dictionary<string, Delegate>();

    public IQueryable<T> Cache<T>(Expression<Func<TContext, IQueryable<T>>> q)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache) if (!cache.TryGetValue(key, out result))
        {
            result = cache[key] = CompiledQuery.Compile(q);
        }
        return ((Func<TContext, IQueryable<T>>)result)(db);
    }

    public IQueryable<T> Cache<T, TArg1>(Expression<Func<TContext, TArg1, IQueryable<T>>> q, TArg1 param1)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache) if (!cache.TryGetValue(key, out result))
        {
            result = cache[key] = CompiledQuery.Compile(q);
        }
        return ((Func<TContext, TArg1, IQueryable<T>>)result)(db, param1);
    }

    public IQueryable<T> Cache<T, TArg1, TArg2>(Expression<Func<TContext, TArg1, TArg2, IQueryable<T>>> q, TArg1 param1, TArg2 param2)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache) if (!cache.TryGetValue(key, out result))
        {
            result = cache[key] = CompiledQuery.Compile(q);
        }
        return ((Func<TContext, TArg1, TArg2, IQueryable<T>>)result)(db, param1, param2);
    }
}

This can be extended to support more arguments. The great bit is that by passing the parameter values into the Cache method itself, you get implicit typing for the lambda expression.

EDIT: Note that you cannot apply new operators to the compiled queries.. Specifically you cannot do something like this:

var allresults = q.Cache(db => from f in db.Foo select f);
var page = allresults.Skip(currentPage * pageSize).Take(pageSize);

So if you plan on paging a query, you need to do it in the compile operation instead of doing it later. This is necessary not only to avoid an exception, but also in keeping with the whole point of Skip/Take (to avoid returning all rows from the database). This pattern would work:

public IQueryable<Foo> GetFooPaged(int currentPage, int pageSize)
{
    return q.Cache((db, cur, size) => (from f in db.Foo select f)
        .Skip(cur*size).Take(size), currentPage, pageSize);
}

Another approach to paging would be to return a Func:

public Func<int, int, IQueryable<Foo>> GetPageableFoo()
{
    return (cur, size) => q.Cache((db, c, s) => (from f in db.foo select f)
        .Skip(c*s).Take(s), c, s);
}

This pattern is used like:

var results = GetPageableFoo()(currentPage, pageSize);
Familial answered 4/8, 2009 at 11:16 Comment(5)
This is almost exactly the same as I've started working up. The only problem I see is that calling q.ToString() would cause the query to be compiled anyways, since ToString() outputs the parametrized SQL. Am I missing something?Wean
be careful with .ToString() if you change the variable names but LINQ expression is the same, it will change the ToString and therefore the key will be different. It will therefore compile a new query.Brassard
@tghw: .ToString() is not a problem; it stringifies the lambda expression and not the resulting SQL, i.e. "db => db.Foo.Where(x => !x.IsDisabled)". I verified this in locally in an MVC project. @Stan: that's not a real concern since you'll likely have N literal queries in your code versus M*N times those queries are called.Familial
Are you calling ToString on the compiled IQueryable or the original IQueryable? I have test code here that calls into the QueryCache much like the example code I have above, and the key local var is set to a literal interpretation of the LINQ query with no SQL, like my comment above. Set a breakpoint on the lock lines in QueryCache and see what you get.Familial
-1 ToString() is not enough. The same WHERE condition over 2 different table, returns the same string representation. For example: {(x, y) => x.GetTable().Where(ag => (ag.Code = y)).FirstOrDefault()} may represent 2 queries on different tables with the same "Code = 'xxx'" condition.Sessoms
C
3

I had to deal with saving a > 15y/o project that was developed using LinqToSql and was too CPU hungry.

Benchmarking showed that using compiled query is x7 faster for complex queries, and x2 for simple queries (considering that the running the query itself is negligible, here it's just about the throughput of compiling the query).

Caching is NOT done automatically by .Net Framework (no matter what version), this only happens for Entity Framework NOT for LINQ-TO-SQL, and these are different technologies.

Usage of compiled queries is tricky, so here are two important highlights:

  • You MUST compile que query including the materialization instructions (FirstOrDefault/First/Any/Take/Skip/ToList), otherwise you risk bringing your whole database into memory: LINQ to SQL *compiled* queries and when they execute
  • You cannot DOUBLE iterate on a compiled query's result (if it's an IQueryable), but this is basically solved once you properly consider the previous point

Considering that, I came up with this cache class. Using the static approach as proposed in other comments has some maintainability drawbacks - it's mainly less readable -, plus it is harder to migrate an existing huge codebase.

                LinqQueryCache<VCDataClasses>
                    .KeyFromQuery()
                    .Cache(
                        dcs.CurrentContext, 
                        (ctx, courseId) => 
                            (from p in ctx.COURSEs where p.COURSEID == courseId select p).FirstOrDefault(), 
                        5);

On very tight loops, using a cache key from the callee instead of the query itself yielded +10% better performance:

                LinqQueryCache<VCDataClasses>
                    .KeyFromStack()
                    .Cache(
                        dcs.CurrentContext, 
                        (ctx, courseId) => 
                            (from p in ctx.COURSEs where p.COURSEID == courseId select p).FirstOrDefault(), 
                        5);

And here is the code. The cache prevents the coder from returning an IQueryable in a compiled query, just for safety.

public class LinqQueryCache<TContext>
        where TContext : DataContext
    {
        protected static readonly ConcurrentDictionary<string, Delegate> CacheValue = new ConcurrentDictionary<string, Delegate>();

        protected string KeyValue = null;

        protected string Key
        {
            get => this.KeyValue;

            set
            {
                if (this.KeyValue != null)
                {
                    throw new Exception("This object cannot be reused for another key.");
                }

                this.KeyValue = value;
            }
        }

        private LinqQueryCache(string key)
        {
            this.Key = key;
        }

        public static LinqQueryCache<TContext> KeyFromStack(
            [System.Runtime.CompilerServices.CallerFilePath] string sourceFilePath = "",
            [System.Runtime.CompilerServices.CallerLineNumber] int sourceLineNumber = 0)
        {
            return new LinqQueryCache<TContext>(Encryption.GetMd5(sourceFilePath + "::" + sourceLineNumber));
        }

        public static LinqQueryCache<TContext> KeyFromQuery()
        {
            return new LinqQueryCache<TContext>(null);
        }

        public T Cache<T>(TContext db, Expression<Func<TContext, T>> q)
        {
            if (Debugger.IsAttached && typeof(T).IsAssignableFrom(typeof(IQueryable)))
            {
                throw new Exception("Cannot compiled queries with an IQueryableResult");
            }

            if (this.Key == null)
            {
                this.Key = q.ToString();
            }

            if (!CacheValue.TryGetValue(this.Key, out var result))
            {
                result = CompiledQuery.Compile(q);
                CacheValue.TryAdd(this.Key, result);
            }

            return ((Func<TContext, T>)result)(db);
        }

        public T Cache<T, TArg1>(TContext db, Expression<Func<TContext, TArg1, T>> q, TArg1 param1)
        {
            if (Debugger.IsAttached && typeof(T).IsAssignableFrom(typeof(IQueryable)))
            {
                throw new Exception("Cannot compiled queries with an IQueryableResult");
            }

            if (this.Key == null)
            {
                this.Key = q.ToString();
            }

            if (!CacheValue.TryGetValue(this.Key, out var result))
            {
                result = CompiledQuery.Compile(q);
                CacheValue.TryAdd(this.Key, result);
            }

            return ((Func<TContext, TArg1, T>)result)(db, param1);
        }
    }
Cheloid answered 1/5, 2020 at 6:9 Comment(0)
B
2

Since nobody is attempting, I'll give it a shot. Maybe we can both work this out somehow. Here is my attempt at this.

I set this up using a dictionary, I am also not using DataContext although this is trivial i believe.

public static class CompiledExtensions
    {
        private static Dictionary<string, object> _dictionary = new Dictionary<string, object>();

        public static IEnumerable<TResult> Cache<TArg, TResult>(this IEnumerable<TArg> list, string name, Expression<Func<IEnumerable<TArg>, IEnumerable<TResult>>> expression)
        {
            Func<IEnumerable<TArg>,IEnumerable<TResult>> _pointer;

            if (_dictionary.ContainsKey(name))
            {
                _pointer = _dictionary[name] as Func<IEnumerable<TArg>, IEnumerable<TResult>>;
            }
            else
            {
                _pointer = expression.Compile();
                _dictionary.Add(name, _pointer as object);
            }

            IEnumerable<TResult> result;
            result = _pointer(list);

            return result;
        }
    }

now this allows me to do this

  List<string> list = typeof(string).GetMethods().Select(x => x.Name).ToList();

  IEnumerable<string> results = list.Cache("To",x => x.Where( y => y.Contains("To")));
  IEnumerable<string> cachedResult = list.Cache("To", x => x.Where(y => y.Contains("To")));
  IEnumerable<string> anotherCachedResult = list.Cache("To", x => from item in x where item.Contains("To") select item);

looking forward to some discussion about this, to further develop this idea.

Brassard answered 3/8, 2009 at 21:20 Comment(5)
default(IEnumerable<TResult>); == null; in all cases. You're defaulting an Interface.Brie
oops..honest mistake. I had different code there and I just copied and pasted my code and didn't double check it. thanks for noticing.Brassard
This does not do what you think it does. If you have an Expression on an IEnumerable there's no difference between expression.Compile() and the actual Func IL code that would have been generated if you weren't asking for an Expression. In fact, exp.Compile() will most likely be slower since you're missing a lot of compiler optimizations.Familial
To clarify my previous comment, var results = list.Cache("To", x => x.Where(y => y.Contains("To"))) will be slower than simply calling var results = list.Where(y => y.Contains("To")).Familial
@Jason: you could use CompiledQuery.Compile instead of expression.Compile, of course that would require to refactor the code to take Func<TArg:DataContext,TResult> and would require you to pass the DataContext to the method. Because of the amount of refactoring that would need to be done, im going to leave my answer as is.Brassard
R
1

For future posterity : .NET Framework 4.5 will do this by default (according to a slide in a presentation I just watched).

Robbyrobbyn answered 16/10, 2011 at 18:24 Comment(4)
Could you provide an appropriate source for this statement?Pilliwinks
Sorry I'm not sure why I didn't provide a source at the time-I assume it was something I had watched recently at the time and didn't have the link handyRobbyrobbyn
Ok, is it true what you said at the time being? Are all LINQ queries compiled in .NET 4.5?Pilliwinks
I hope so :) especially if you are the Nick N I used to work with. Can't verify right now but it seems reasonable unless they withdrew the featureRobbyrobbyn

© 2022 - 2024 — McMap. All rights reserved.