Entity Framework cached query plan performance degrades with different parameters
D

2

7

I have the following problem.

Background

I'm trying to implement an autocomplete selector with MVC3, EF4 and jquery over a table wit 4.5 million records.

This is the table:

CREATE TABLE [dbo].[CONSTA] (
  [afpCUIT] nvarchar(11) COLLATE Modern_Spanish_CI_AS NOT NULL,
  [afpNombre] nvarchar(30) COLLATE Modern_Spanish_CI_AS NULL,
  [afpGanancias] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpIVA] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpMonot] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpIntSoc] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
  [afpEmpl] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
  [afpAct] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  CONSTRAINT [CONSTA_pk] PRIMARY KEY CLUSTERED ([afpCUIT])
)
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [CONSTA_Nombre_idx] ON [dbo].[CONSTA]
  ([afpNombre])
WITH (
  PAD_INDEX = OFF,
  DROP_EXISTING = OFF,
  STATISTICS_NORECOMPUTE = OFF,
  SORT_IN_TEMPDB = OFF,
  ONLINE = OFF,
  ALLOW_ROW_LOCKS = OFF,
  ALLOW_PAGE_LOCKS = OFF)
ON [PRIMARY]
GO

The table is pretty static (it only needs a monthly batch update) and read only.

if somebody cares to download the records (54MB) this is the URL:

http://www.afip.gob.ar/genericos/cInscripcion/22102011.zip

and here is the record description:

http://www.afip.gob.ar/genericos/cInscripcion/archivoCompleto.asp

Here is the code of the app:

CONTROLLER:

public class AltaMasivaController : Controller
{
    //
    // GET: /AltaMasiva/

    public ActionResult Index()
    {
        return View();
    }

    public JsonResult GetUsers(string query)
    {
        CENT2Entities db = new CENT2Entities();
        bool isCUIT = true;

        for(int j = 0; j < query.Length; j++)
            if (! Char.IsDigit(query, j))
            {
                isCUIT = false;
                break;
            }

        if (isCUIT)
        {
            // nvarchar search
            var x = from u in db.CONSTA
                    where u.afpCUIT.StartsWith(query)
                    orderby u.afpNombre
                    select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };

            return Json(x.Take(50), JsonRequestBehavior.AllowGet);
        }
        else
        {
            // nvarchar search
            var x = from u in db.CONSTA
                    where u.afpNombre.StartsWith(query)
                    orderby u.afpNombre
                    select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };

            return Json(x.Take(50), JsonRequestBehavior.AllowGet);
        }
    } 
}

VIEW:

@{
    viewbag.title = "index";
}

<h2>index</h2>
@html.textbox("user", "", new { style="width: 400px;" })

<script type="text/javascript">

$("input#user").autocomplete(
{ 
    source: function (request, response) 
    { 
        // define a function to call your action (assuming usercontroller) 
        $.ajax(
        { 
            url: '/altamasiva/getusers', type: "post", datatype: "json", 

            // query will be the param used by your action method 
            data: { query: request.term }, 

            success: function(data){ 
                response( $.map(data, function (item){ return { label: item.label + " (" + item.id + ")", value: item.label, id: item.id }; })); 
            } 
        }) 
    }, 
    minlength: 1, // require at least one character from the user
});

</script>

And now:

THE PROBLEM

As you can see, The code follows different paths if the query string contains only numbers.

When all the characters of the controller parameter are numbers (where u.afpCUIT.StartsWith(query) ), the query optimizer "is supposed to" perform a clustered index seek (which it does) and return the first 50 rows it finds. When the first "autocomplete" string arrives (usually one or two characters at most) the query performs extraordinarily fast, but, when the lenght of the string increases, the performance degrades notably (it takes almost between 20 seconds to 2 minutes with 9 or more chars). Surprisingly, after "restarting" the SQL Server Service, if the initial string contains 10 chars, it performs great too, but the performance degrades when we delete chars from the "query" string, the complete opposite.

Why is this happening?

When SQL server compiles the first execution plan, it optimizes it to perform really fast with a large result set (or viceversa). Subsequent queries, which narrows (or expands) the resultset, require a different execution plan ... BUT ... EF generated SQL uses commad parameters to (precisely) avoid statement recompiling ...

Cleaning the Execution Plan Cache by executing:

db.ExecuteStoreCommand("DBCC FREEPROCCACHE");

restores the performance to excellent response times ... BUT ... it kills al plans in all databases, thus degrading the performance of all other cached plans (which generally perform OK).

After doing some profiling on the EF sql statements, I executed DBCC FREEPROCCACHE in Query Analyzer prior to the sql EF generates, which turned out to generate different execution plans, all performing in the 250ms range, independently of the parameter length:

DBCC FREEPROCCACHE

exec sp_executesql N'SELECT TOP (50) 
[Project1].[C1] AS [C1], 
[Project1].[C2] AS [C2], 
[Project1].[afpCUIT] AS [afpCUIT]
FROM ( SELECT 
    [Extent1].[afpCUIT] AS [afpCUIT], 
    [Extent1].[afpNombre] AS [afpNombre], 
    1 AS [C1], 
    RTRIM([Extent1].[afpNombre]) AS [C2]
    FROM [dbo].[CONSTA] AS [Extent1]
    WHERE [Extent1].[afpCUIT] LIKE @p__linq__0 ESCAPE N''~''
)  AS [Project1]
ORDER BY [Project1].[afpNombre] ASC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'2023291%'

THE QUESTION

Is there a more ellegant alternative to

db.ExecuteStoreCommand("DBCC FREEPROCCACHE");

?

Surprisingly, the second path of the query ( where u.afpNombre.StartsWith(query) ) are not affected by the same problem and performs great. Obviously, execution plans do not change when the lenght of the string changes...

I found an ObjectContext parameter in older versions of EF:

System.Data.EntityClient.EntityCommand.EnablePlanCaching

but I couldn't find it in EF4, and I'm not sure if the global results would be the same.

I'm really puzzled with this problem, and I don't know where the real problem lies

Poor index design? Lack of partitions? SQL SERVER 2008 Express edition? EF generated SQL? Lousy luck?

Any help would be great. Thanx in advance!

Diapason answered 1/11, 2011 at 5:28 Comment(2)
You could easily switch out EF4 with a stores procedure if you think that EF is causing the issue. Have you used proper SQL Server profiling tools (like Quest) and checked buffer flushes, disk I/O, etc? Have you thought about increasing the RAM made available to SQL Server?Epitasis
Did it already, but with the same lame results. I was tempted to make a condition on the length of the "query" param, but it would only work with the same statistics... Regarding the server resources, they are not the problem, since the query performs great in a laptop (as long as the query plan is not cached) Thank you very much anyway!Ide
P
1

There's a way to remove a single plan from SQL Server's cache. It's explained in detail here: http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/29/geek-city-clearing-a-single-plan-from-cache.aspx

Also, you can create a Stored Procedure, and map it with Entity Framework instead of using LINQ2Entities, and in this way make spcific changes to the SQL syntax, and make sure it's always the same.

Pulsar answered 1/11, 2011 at 10:25 Comment(2)
Svarog, Thank you very much, the blog really hits the nail (it issues the exact same problem I have) and provided me: a) a "more elegant" workaround I can use inmediately in development: flushing only the problematic database proc (not the entire server cache) cache with: db.ExecuteStoreCommand("DECLARE @myDb AS INT = DB_ID(); DBCC FLUSHPROCINDB(@myDb)"); b) a workaround for production environment, since sp_create_plan_guide is not available for SQL EXPRESS. I'm not sure this is a problem in Enterprise, but I'm begining to think this will have to do in production. Thank you very much!Ide
This link appears to be dead. Can the relevant information be found, and the information added to the answer instead of just a link? It appears that the information in @JerónimoVargas comment may be all that is needed, but the implication is that more info exists.Brooch
M
0

As you identified, SQL Server compiles the plan to be optimized for one parameter value with large result set. When the result set is narrowed, the query doesn't perform well.

This scenario requires the use of "option (recompile)" hint in the query, so the query will be recompiled for each value it receives.

It's not so easy to do this with entity framework. You will need to create a DbCommandInterceptor to include option (recompile) in the query. Another option is to create a plan guide in SQL Server to add the "option (recompile)" to the query.

You will find information about the DbCommandInterceptor here - Adding a query hint when calling Table-Valued Function

About the plan guide, you will need something similar to this:

EXEC sp_create_plan_guide   
'planguidename',   
N'SELECT TOP (50) 
[Project1].[C1] AS [C1], 
[Project1].[C2] AS [C2], 
[Project1].[afpCUIT] AS [afpCUIT]
FROM ( SELECT 
    [Extent1].[afpCUIT] AS [afpCUIT], 
    [Extent1].[afpNombre] AS [afpNombre], 
    1 AS [C1], 
    RTRIM([Extent1].[afpNombre]) AS [C2]
    FROM [dbo].[CONSTA] AS [Extent1]
    WHERE [Extent1].[afpCUIT] LIKE @p__linq__0 ESCAPE N''~''
)  AS [Project1]
ORDER BY [Project1].[afpNombre] ASC',
'SQL',   
NULL,   
N'@p__linq__0 nvarchar(4000)',
N'OPTION (recompile)'
Margarito answered 24/5, 2017 at 6:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.