How can I keep Entity Framework from generating inefficient queries in SQL Server?
Asked Answered
C

2

5

I have an entity defined in an EF 4.0 that is based on a view. The view lays over a table with about 18 million rows of data. I have selected the 4 deterministic properties of the entity as a compound Entity key for this entity. I have exposed access to the model containing this view via an OData WCF data service. The WCF Data service is set to limit results like so

config.SetEntitySetPageSize("*", 100)

When I make a basic query against this view:

var fcbs = this.iBenchmarkCostContext.FtCostBenchmarks.ToArray();

This is the query that I see in my profiling tool:

SELECT TOP (100) [Extent1].[MonthBeginDt]                     AS [MonthBeginDt],
                 [Extent1].[dmCostBenchmarkKey]               AS [dmCostBenchmarkKey],
                 [Extent1].[dmProductKey]                     AS [dmProductKey],
                 [Extent1].[IsImputedFlg]                     AS [IsImputedFlg],
                 [Extent1].[ProjectedCopayPerRxAmt]           AS [ProjectedCopayPerRxAmt],
                 [Extent1].[ProjectedPricePerQtyAmt]          AS [ProjectedPricePerQtyAmt],
                 [Extent1].[ProjectedQtyPerRxQty]             AS [ProjectedQtyPerRxQty],
                 [Extent1].[ProjectedRxCnt]                   AS [ProjectedRxCnt],
                 [Extent1].[AvgPriceRxAvgPriceQtyDenominator] AS [AvgPriceRxAvgPriceQtyDenominator],
                 [Extent1].[AvgQtyDenominator]                AS [AvgQtyDenominator],
                 [Extent1].[AvgCopayDenominator]              AS [AvgCopayDenominator],
                 [Extent1].[ProjectedTotalCostAmt]            AS [ProjectedTotalCostAmt],
                 [Extent1].[AllowedRxCnt]                     AS [AllowedRxCnt],
                 [Extent1].[CopayRxCnt]                       AS [CopayRxCnt],
                 [Extent1].[TotalAllowedAmt]                  AS [TotalAllowedAmt],
                 [Extent1].[TotalCopayAmt]                    AS [TotalCopayAmt],
                 [Extent1].[TotalCostPerUnitAmt]              AS [TotalCostPerUnitAmt],
                 [Extent1].[TotalUnitQty]                     AS [TotalUnitQty],
                 [Extent1].[RC]                               AS [RC]
FROM   (SELECT [ftCostBenchmark].[MonthBeginDt]                     AS [MonthBeginDt],
               [ftCostBenchmark].[dmCostBenchmarkKey]               AS [dmCostBenchmarkKey],
               [ftCostBenchmark].[dmProductKey]                     AS [dmProductKey],
               [ftCostBenchmark].[IsImputedFlg]                     AS [IsImputedFlg],
               [ftCostBenchmark].[ProjectedCopayPerRxAmt]           AS [ProjectedCopayPerRxAmt],
               [ftCostBenchmark].[ProjectedPricePerQtyAmt]          AS [ProjectedPricePerQtyAmt],
               [ftCostBenchmark].[ProjectedQtyPerRxQty]             AS [ProjectedQtyPerRxQty],
               [ftCostBenchmark].[ProjectedRxCnt]                   AS [ProjectedRxCnt],
               [ftCostBenchmark].[AvgPriceRxAvgPriceQtyDenominator] AS [AvgPriceRxAvgPriceQtyDenominator],
               [ftCostBenchmark].[AvgQtyDenominator]                AS [AvgQtyDenominator],
               [ftCostBenchmark].[AvgCopayDenominator]              AS [AvgCopayDenominator],
               [ftCostBenchmark].[ProjectedTotalCostAmt]            AS [ProjectedTotalCostAmt],
               [ftCostBenchmark].[AllowedRxCnt]                     AS [AllowedRxCnt],
               [ftCostBenchmark].[CopayRxCnt]                       AS [CopayRxCnt],
               [ftCostBenchmark].[TotalAllowedAmt]                  AS [TotalAllowedAmt],
               [ftCostBenchmark].[TotalCopayAmt]                    AS [TotalCopayAmt],
               [ftCostBenchmark].[TotalCostPerUnitAmt]              AS [TotalCostPerUnitAmt],
               [ftCostBenchmark].[TotalUnitQty]                     AS [TotalUnitQty],
               [ftCostBenchmark].[RC]                               AS [RC]
        FROM   [dbo].[ftCostBenchmark] AS [ftCostBenchmark]) AS [Extent1]
ORDER  BY [Extent1].[MonthBeginDt] ASC,
          [Extent1].[dmCostBenchmarkKey] ASC,
          [Extent1].[dmProductKey] ASC,
          [Extent1].[IsImputedFlg] ASC

Although I have not explicitly requested any ordering, an order by clause is added that includes the fields included in the compound entity key defined for the Entity. Execution of this query takes an inordinate amount of time and generates page locks on the database. Removal of the ORDER BY from the query in the SQL Server environment returns results in less than a milisecond.

So my question is:

How can I stop EF from adding that order by clause to the query?

Cowhide answered 27/1, 2012 at 20:20 Comment(0)
U
0

If there is consistency in the names and order of the columns in the 'order by' clause, there is likely desirability for this result by the users.

Just put an index on the table, or modify the primary key, so that index can be used in the retrieval.

create index x on [dbo].[ftCostBenchmark]
                      ([MonthBeginDt],
                       [dmCostBenchmarkKey],
                       [dmProductKey],
                       [IsImputedFlg])
                  include
                       (existing-primarykey-on-the-table)

Odd that there is no where clause on the query. If you trimmed a where-clause for brevity, put those fields first in the index.

Unmanly answered 27/1, 2012 at 21:8 Comment(3)
This was a simple test of connectivity and data retrieval from a WCF Data service. If you will note comments above you will see that the entity is laid over a view and the underlying table is indexed on the fields in the order by. In normal use there would be a where clause as well, but for this test I wanted to be assured that should a user make a wide open query, results would still come back. The WCF service impliments rules to restrict open queries to a max of 100 rows.Cowhide
I just need to figure out how to get rid of the order by that WCF is adding. It is causing the clustered index scan to be expanded to the entire database rather than the top 10 rows.Cowhide
There is a clustered index on the table that the view is based on, but no index on the view itself. I beleive you are correct that adding the apprpriate index will resolve the performance issue. Unfortunately I don't own that database and will have to wait on the resources of another department to resolve the issue. Thanks for your responses.Cowhide
D
5

Row limits don't make any sense without specifying an order.

If you want to retrieve records in storage order, create a clustered index and explicitly specify it in your query.

One way to do this is with a query interceptor.

Dunkle answered 27/1, 2012 at 20:35 Comment(12)
The underlying table has a clustered index on the same 4 fields. This entity is to be exposed on an odata service. This issue was encountered while attempting a simple test query against the table. The row limit exists incase the client makes a wide open query on the entity. In such a case, no order is required. The user gets what they get. The issue I have is that EF, of its own accord, adds the order by. I do not specify it and I don't want it. How can I stop EF from adding it?Cowhide
Oh, and removing the directive to config.SetEntitySetPageSize("*", 100) does not change this behavior.Cowhide
If there's a clustered index on those fields and you're still having performance/locking issues, then something else is going on. Look at the query execution plan. Are the 4 fields added in the same order as the index? No, EF doesn't add order bys "on its own." WCF Data Services might, though.Dunkle
No, the fields are not added in the same order as the index. I re-ordered them in the EF created query and this did not change the behavior. I have not tested this outside of the service. I will try it in a project that references the Model assembly directly and see if that makes a difference in the generated SQL.Cowhide
That is to say I reordered the orderby list of fields and tried to run it in SQL server directly. Looking at the execution plan, SQL server is doing a Clustered Index Scan of the underlying index (18m rows) and then attempting a hash match of 13m rows to combine the results with those of another index and then it is trying to sort the 13m.Cowhide
The general strategy I would suggest is (1) Find an ordered SQL statement which works (you won't want to be doing full retrievals of 18m rows, so you need paging, and w/ paging you need ordering) and then (2) write a query interceptor which causes the same SQL.Dunkle
Connection directly to the model using a reference does not include the order by. So your surmise would seem to be correct. This is someting that the WCF Data Service is adding.Cowhide
With the Orderby, the Custered Index Scan is 18m, Without the Orderby it is an estimated 95 rows.Cowhide
The question is, how could I, even in a query interceptor, remove an order by clause that I did not issue in the first place?Cowhide
I am not understanding why you think you can get rid of it. You can't possibly want to return 18m rows from a web service call. So you must either restrict or page, and paging requires an order by. So you need to either find an order by which works or use a where clause.Dunkle
The odd thing is that the order by is added whether the paging is enabled or not. Of course I don't want to serve up 18m records. I am just trying to understand why manually passing the Take(10) and running the query against the service still generates the order by. Running the SQL generated by that request with the Top 10 command and the order by removed returns instantaniously. This is of course what I would like to happen. The order by is not helping, whether the order I put the fields in matches the index or not. In the end I just want a reckless user quey to return the first pageCowhide
and to do so without timing out, which is what happens when you try to run the ordered query.Cowhide
U
0

If there is consistency in the names and order of the columns in the 'order by' clause, there is likely desirability for this result by the users.

Just put an index on the table, or modify the primary key, so that index can be used in the retrieval.

create index x on [dbo].[ftCostBenchmark]
                      ([MonthBeginDt],
                       [dmCostBenchmarkKey],
                       [dmProductKey],
                       [IsImputedFlg])
                  include
                       (existing-primarykey-on-the-table)

Odd that there is no where clause on the query. If you trimmed a where-clause for brevity, put those fields first in the index.

Unmanly answered 27/1, 2012 at 21:8 Comment(3)
This was a simple test of connectivity and data retrieval from a WCF Data service. If you will note comments above you will see that the entity is laid over a view and the underlying table is indexed on the fields in the order by. In normal use there would be a where clause as well, but for this test I wanted to be assured that should a user make a wide open query, results would still come back. The WCF service impliments rules to restrict open queries to a max of 100 rows.Cowhide
I just need to figure out how to get rid of the order by that WCF is adding. It is causing the clustered index scan to be expanded to the entire database rather than the top 10 rows.Cowhide
There is a clustered index on the table that the view is based on, but no index on the view itself. I beleive you are correct that adding the apprpriate index will resolve the performance issue. Unfortunately I don't own that database and will have to wait on the resources of another department to resolve the issue. Thanks for your responses.Cowhide

© 2022 - 2024 — McMap. All rights reserved.