How to "warm-up" Entity Framework? When does it get "cold"?

5

119

No, the answer to my second question is not the winter.

Preface:

I've been doing a lot of research on Entity Framework recently, and something that keeps bothering me is its performance when queries are not warmed up, the so-called cold queries.

I went through the performance considerations article for Entity Framework 5.0. The authors introduced the concept of warm and cold queries and how they differ, which I had also noticed myself without knowing the terms existed. It's probably worth mentioning that I only have six months of experience.

Now I know what topics I can research into additionally if I want to understand the framework better in terms of performance. Unfortunately most of the information on the Internet is outdated or bloated with subjectivity, hence my inability to find any additional information on the Warm vs Cold queries topic.

Basically what I've noticed so far is that whenever I have to recompile or the recycling hits, my initial queries are getting very slow. Any subsequent data read is fast (subjective), as expected.

We'll be migrating to Windows Server 2012, IIS8 and SQL Server 2012 and as a Junior I actually won myself the opportunity to test them before the rest. I'm very happy they introduced a warming-up module that will get my application ready for that first request. However, I'm not sure how to proceed with warming up my Entity Framework.

What I already know is worth doing:

  • Generate my Views in advance as suggested.
  • Eventually move my models into a separate assembly.

What I'm considering, going by common sense, though it is probably the wrong approach:

  • Doing dummy data reads at Application Start in order to warm things up, generate and validate the models.

Questions:

  • What would be the best approach to have high availability on my Entity Framework at any time?
  • In what cases does the Entity Framework get "cold" again? (Recompilation, Recycling, IIS Restart etc.)
Clang answered 6/11, 2012 at 12:5 Comment(11)
Figure out whether this is view generation or query compilation that hits you the most. If this is view gen then use precompiled views. If this is the queries - do you have a big complicated hierarchy? Note that expensive things usually happen once per app domain and are cached therefore you see this kind of problems when app domain is unloaded and a new one is created.Urger
I've mentioned view generation already @Pawel, the hierarchy is not complicated, not even a little bit. But the problem is principal as well. Following what you said, I'll research into when the app domain is being unloaded. However, that still doesn't help the other problem which is warming the Entity Framework in case, like you said, the app domain gets unloaded. At this point, it seems that the app domain is being unloaded more than it should be and I'm not sure why, recycling is only in the night, idling is set to 0.Clang
Why do you think doing dummy data reads is the wrong approach?Range
It just doesn't feel right, I thought there might be something more elegant that I'm not aware of. But if that's the only solution and someone with good knowledge can confirm there isn't another way, I'll just go with it.Clang
One fix I used for the app pool shutting down after a period of inactivity (due to low traffic) is a service that requests one of your pages at a set interval. This prevents the long delay while the app pool restarts on the first request. Alternatively, you can use a free service like www.pingalive.com to ping your domain/IP. This also helps prevent your cached objects from being cleared before they expire.Denier
@CStyle You can achieve this internally without having to make a separate service, create a cache entry with a callback function which just makes the requests and inserts the cache entry again. You can specify times etc. I am using it successfully in one project, of course the first request need to be guaranteed somehow - warmup module (preload/autostart enabled) is a good way to do that.Clang
I just started firing off a thread at application start that creates, checks for Any() of any DBSet (doesn't seem to matter) and it shaves 2 seconds off the first page showing.Upton
Did you actually find query compilation to be slow by profiling? EF is also just very, very expensive to compile, which adds to the startup time, and it also needs quite a while to just create the model (in the case of code first). I don't think you can do much about that, although ngen helps at least a little bit.Amarillo
This is a rather old thread @John, but there are now some considerable improvements coming to start up times for Entity Framework I think the version might be out already, been a while I read about its development.Clang
@Peter Maybe there's been improvements, but in all my web apps, EF is definitely still the component that spoils the startup time by a long way. I've heard even EF Core still has an expensive warm up.Amarillo
@Amarillo Yeah that's true, but if you setup your app properly with some sort of internal keep-alive function, in addition to IIS preload functions that can invoke a certain web page to open, you can warm up the whole website, your cache and prepare EF for the first requests. That's what I do at least and if the servers are stable, the website never has this slow starting because it's technically always alive and running.Clang
S
54
  • What would be the best approach to have high availability on my Entity Framework at any time?

You can go for a mix of pregenerated views and static compiled queries.

Static CompiledQuery instances are good because they're quick and easy to write and help increase performance. However, with EF5 it isn't necessary to compile all your queries, since EF auto-compiles queries itself. The only problem is that these auto-compiled queries can get lost when the cache is swept. So you still want to hold references to your own compiled queries for those that run only rarely but are expensive. If you put those queries into static classes, they will be compiled when they're first required. This may be too late for some queries, so you may want to force compilation of these queries during application startup.
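As a rough illustration of the paragraph above, here is a minimal sketch of a statically held compiled query with a startup warm-up. It assumes EF5 on .NET 4.5 with an ObjectContext-based model; `Customer`, `ShopEntities`, `CustomerQueries` and the connection string name are hypothetical and not taken from the question.

```csharp
using System;
using System.Data.Objects; // EF5 on .NET 4.5; EF6 uses System.Data.Entity.Core.Objects
using System.Linq;

// Hypothetical entity and context, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public int RegionId { get; set; }
}

public class ShopEntities : ObjectContext
{
    private ObjectSet<Customer> _customers;

    // Assumes a connection string named "ShopEntities" in the config file.
    public ShopEntities() : base("name=ShopEntities", "ShopEntities") { }

    public ObjectSet<Customer> Customers
    {
        get { return _customers ?? (_customers = CreateObjectSet<Customer>()); }
    }
}

public static class CustomerQueries
{
    // The static field keeps the delegate alive for the life of the AppDomain,
    // so it does not depend on the automatic query cache.
    public static readonly Func<ShopEntities, int, IQueryable<Customer>> ByRegion =
        CompiledQuery.Compile<ShopEntities, int, IQueryable<Customer>>(
            (ctx, regionId) => ctx.Customers.Where(c => c.RegionId == regionId));

    // Call once at application startup so the first real request
    // does not pay the translation cost.
    public static void WarmUp()
    {
        using (var ctx = new ShopEntities())
        {
            // ToList() enumerates the query without composing further LINQ
            // operators on top of the compiled query.
            ByRegion(ctx, 0).ToList();
        }
    }
}
```

Calling `CustomerQueries.WarmUp()` from application startup is one way to "force compilation" early. The region id `0` is an arbitrary value chosen to keep the result set small.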

Pregenerating views is the other possibility, as you mention, especially for those queries that take very long to compile and that don't change. That way you move the performance overhead from runtime to compile time, and it won't introduce any lag. But of course this change goes through to the database, so it's not so easy to deal with. Code is more flexible.

Do not use a lot of TPT inheritance (that's a general performance issue in EF). Build your inheritance hierarchies neither too deep nor too wide. Having only 2-3 properties specific to some class may not be enough to justify a separate type; they could instead be handled as optional (nullable) properties on an existing type.

Don't hold on to a single context for a long time. Each context instance has its own first level cache which slows down the performance as it grows larger. Context creation is cheap, but the state management inside the cached entities of the context may become expensive. The other caches (query plan and metadata) are shared between contexts and will die together with the AppDomain.

All in all you should make sure to allocate contexts frequently and use them only for a short time, that you can start your application quickly, that you compile queries that are rarely used and provide pregenerated views for queries that are performance critical and often used.

  • In what cases does the Entity Framework get "cold" again? (Recompilation, Recycling, IIS Restart etc.)

Basically, every time you lose your AppDomain. By default, IIS recycles the application pool every 29 hours (1740 minutes), so you can never guarantee that your instances will stay around. The AppDomain is also shut down after some time without activity. You should aim to come up again quickly. Maybe you can do some of the initialization asynchronously (but beware of multi-threading issues). You can use scheduled tasks that call dummy pages in your application during times when there are no requests to prevent the AppDomain from dying, but it will die eventually.
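The scheduled "dummy page" calls mentioned above could look something like the following sketch. It is an in-process variant using a timer, which is an assumption on my part rather than what the answer prescribes; `KeepAlive` and the URL you pass in are hypothetical. An in-process timer can only keep an already running AppDomain busy and cannot revive one that has been shut down, so an external scheduled task or the IIS preload feature is still needed for the first request.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Threading;

// Hypothetical helper: periodically requests a page of the application
// so that the idle timeout is less likely to unload the AppDomain.
public static class KeepAlive
{
    // Held in a static field so the timer is not garbage collected.
    private static Timer _timer;

    public static void Start(Uri pingUrl, TimeSpan interval)
    {
        _timer = new Timer(_ =>
        {
            try
            {
                using (var client = new WebClient())
                {
                    // The response body is irrelevant; the request itself counts as activity.
                    client.DownloadString(pingUrl);
                }
            }
            catch (WebException ex)
            {
                // A failed ping should never take the application down.
                Trace.TraceWarning("Keep-alive ping failed: " + ex.Message);
            }
        }, null, interval, interval);
    }
}
```

A call such as `KeepAlive.Start(new Uri("http://localhost/warmup"), TimeSpan.FromMinutes(5))` at startup would ping a hypothetical `/warmup` page every five minutes.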

I also assume when you change your config file or change the assemblies there's going to be a restart.

Station answered 28/11, 2012 at 12:58 Comment(3)
@Station Actually even with static compiled queries, the first run is too long. Is there any way to warm up those apart from: Doing dummy data reads at Application Start in order to warm things up, generate and validate the models.Duster
@Station So does Entity Framework 5 need it or not? What difference does it make on EF5 (I mean, is it still slow, a little better, or no different)?Godiva
"Static CompiledQuerys are good because they're quick and easy to write and help reduce performance." Reduced performance?Lamrouex
N
8

If you are looking for maximum performance across all calls you should consider your architecture carefully. For instance, it might make sense to pre-cache often used look-ups in server RAM when the application loads up instead of using database calls on every request. This technique will ensure minimum application response times for commonly used data. However, you must be sure to have a well behaved expiration policy or always clear your cache whenever changes are made which affect the cached data to avoid issues with concurrency.
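A minimal sketch of such a lookup cache with an expiration policy and explicit invalidation, using `System.Runtime.Caching.MemoryCache`. The `LookupCache` class, the cache key, the 30-minute lifetime and the `loadFromDatabase` delegate are illustrative assumptions, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Caching;

// Hypothetical cache for a rarely changing lookup list.
public static class LookupCache
{
    private const string CountriesKey = "lookup:countries";
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // loadFromDatabase is whatever query actually fetches the data (e.g. via EF).
    public static IList<string> GetCountries(Func<IList<string>> loadFromDatabase)
    {
        var cached = Cache.Get(CountriesKey) as IList<string>;
        if (cached != null)
        {
            return cached; // served from RAM, no database round trip
        }

        var loaded = loadFromDatabase();
        var policy = new CacheItemPolicy
        {
            // Assumed lifetime; tune it to how stale the data may become.
            AbsoluteExpiration = DateTimeOffset.UtcNow.AddMinutes(30)
        };
        Cache.Set(CountriesKey, loaded, policy);
        return loaded;
    }

    // Call this whenever the underlying data is changed.
    public static void Invalidate()
    {
        Cache.Remove(CountriesKey);
    }
}
```

Two concurrent first requests may both hit the database in this sketch; for a read-only lookup list that is usually harmless, but it is a simplification.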

In general, you should strive to design distributed architectures to only require IO-based data requests when the locally cached information becomes stale or needs to be transactional. Any "over the wire" data request will normally take 10-1000 times longer than a local, in-memory cache retrieval. This one fact alone often makes discussions about "cold vs. warm data" inconsequential in comparison to the "local vs. remote" data issue.

Nashoma answered 27/11, 2012 at 14:13 Comment(2)
This is a good point I'm often ignoring, while getting hyped about the entity framework raw performance. I'll look into this further and research more into the principles of caching. However, "cold vs. warm" in terms of EF is still something I want to understand better.Clang
"This one fact alone often makes discussions about "cold vs. warm data" inconsequential in comparison to the "local vs. remote" data issue." Not really. If you don't have it cached locally (which you won't initially), you'll still need to hit EF and suffer the initialization pain in order to prime your cache. The same places where your cache is uninitialized, EF will be uninitialized. So adding a layer of caching may not help if the only issue is the EF initialization time, but it will add another layer of complexity...Diligent
S
7

General tips.

  • Perform rigorous logging including what is accessed and request time.
  • Perform dummy requests when initializing your application to warm boot very slow requests that you pick up from the previous step.
  • Don't bother optimizing unless it's a real problem. Communicate with the consumers of the application and ask. Get comfortable having a continuous feedback loop, if only to figure out what needs optimization.

Now to explain why dummy requests are not the wrong approach.

  • Less Complexity - You are warming up the application in a manner that will work regardless of changes in the framework, and you don't need to figure out possibly funky APIs/framework internals to do it the right way.
  • Greater Coverage - You are warming up all layers of caching at once related to the slow request.
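The dummy-request approach described above can be sketched as follows for a code-first DbContext in an ASP.NET application. `Product`, `ShopContext` and the choice of `Any()` as the dummy read are assumptions for illustration; the point is only that one cheap query at startup forces model creation and the first query compilation before a real user arrives.

```csharp
using System;
using System.Data.Entity;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
using System.Web;

// Hypothetical entity and context, for illustration only.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Product> Products { get; set; }
}

public class Global : HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Run the warm-up in the background so application start is not blocked.
        Task.Run(() => WarmUpEntityFramework());
    }

    private static void WarmUpEntityFramework()
    {
        try
        {
            using (var context = new ShopContext())
            {
                // A cheap dummy read: builds the model and opens a connection.
                context.Products.Any();
            }
        }
        catch (Exception ex)
        {
            // Warm-up is best effort; log and continue.
            Trace.TraceWarning("EF warm-up failed: " + ex.Message);
        }
    }
}
```

To warm specific slow requests picked up from logging, the single `Any()` call would be replaced by those queries.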

To explain when a cache gets "Cold".

This happens at any layer in your framework that applies a cache, there is a good description at the top of the performance page.

  • Whenever a cache has to be validated after a potential change that makes it stale. This could be a timeout or something more intelligent (i.e. a change in the cached item).
  • When a cache item is evicted. The algorithm for doing this is described in the section "Cache eviction algorithm" in the performance article you linked, but in short:
    • An LFRU (least frequently/recently used) eviction policy based on hit count and age, with a limit of 800 items.

The other things you mentioned, specifically recompilation and restarting of IIS clear either parts or all of the in memory caches.

Streetcar answered 29/11, 2012 at 5:43 Comment(1)
This is another helpful answer, much appreciated.Clang
M
4

As you have stated, use "pre-generated views"; that's really all you need to do.

Extracted from your link: "When views are generated, they are also validated. From a performance standpoint, the vast majority of the cost of view generation is actually the validation of the views"

This means the performance hit will take place when you build your model assembly. Your context object will then skip the "cold query" and stay responsive for the duration of the context object's life cycle, as will subsequent new object contexts.

Executing irrelevant queries will serve no other purpose than to consume system resources.

The shortcut ...

  1. Skip all that extra work of pre-generated views
  2. Create your object context
  3. Fire off that sweet irrelevant query
  4. Then just keep a reference to your object context for the duration of your process (not recommended).
Merino answered 14/11, 2012 at 20:37 Comment(1)
Check this: Using T4 Template for View Generation.Swept
O
1

I have no experience in this framework. But in other contexts, e.g. Solr, completely dummy reads will not be of much use unless you can cache the whole DB (or index).

A better approach would be to log the queries, extract the most common ones out of the logs and use them to warm up. Just be sure not to log the warm up queries or remove them from the logs before proceeding.
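A small sketch of that idea: pick the most frequent entries from a query log while skipping entries produced by the warm-up itself. The log format (one query text per entry) and the `[warmup]` prefix used to mark warm-up entries are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that selects warm-up candidates from logged queries.
public static class WarmUpPlanner
{
    // Assumed convention: warm-up runs prefix their log entries with this marker.
    private const string WarmUpMarker = "[warmup]";

    public static IList<string> MostFrequent(IEnumerable<string> loggedQueries, int count)
    {
        return loggedQueries
            // Ignore entries written by the warm-up itself so they do not skew the counts.
            .Where(q => !q.StartsWith(WarmUpMarker, StringComparison.Ordinal))
            .GroupBy(q => q)
            .OrderByDescending(g => g.Count())
            .Take(count)
            .Select(g => g.Key)
            .ToList();
    }
}
```

For example, given the entries `"A"`, `"B"`, `"A"` and `"[warmup] A"`, `MostFrequent(entries, 1)` yields a list containing only `"A"`.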

Ortolan answered 23/11, 2012 at 15:13 Comment(0)
