Why is my Entity Framework query with Single slow?

I have quite a simple query that is very slow. Entity Framework Profiler says it takes about 100 ms.

dbContext.Users.Single(u => u.Id == userId);

After experimenting a bit, I found a very similar query that is much faster (about 3 ms).

dbContext.Users.Where(u => u.Id == userId).ToList().Single();

When I compare the SQL of the two queries, the second one uses neither a nested SELECT nor a TOP operation. But I would not expect it to be 30 times faster just because of these two things. Also, when I execute the queries in SQL Server Management Studio, there is no measurable difference.

When I look at the execution plans, both perform a clustered index seek that accounts for 100% of the query cost, whereas the additional SELECT and the TOP operation have 0% query cost. The query plan from EFProfiler says the same, indicating that it should not make any difference.

What can I do to get a better understanding of the query performance in this case?

Below is the resulting SQL for the first query.

SELECT [Limit1].[Id]           AS [Id],
       [Limit1].[EmailAddress] AS [EmailAddress],
       [Limit1].[FirstName]    AS [FirstName],
       [Limit1].[LastName]     AS [LastName]
FROM   (SELECT TOP (2) [Extent1].[Id]           AS [Id],
                       [Extent1].[EmailAddress] AS [EmailAddress],
                       [Extent1].[FirstName]    AS [FirstName],
                       [Extent1].[LastName]     AS [LastName]
        FROM   [dbo].[Users] AS [Extent1]
        WHERE  ([Extent1].[Id] = 'b5604f88-3e18-42a5-a45e-c66cc2a632d3' /* @p__linq__0 */)
               AND ('b5604f88-3e18-42a5-a45e-c66cc2a632d3' /* @p__linq__0 */ IS NOT NULL)) AS [Limit1]

Here is the SQL of the second (faster) query.

SELECT [Extent1].[Id]           AS [Id],
       [Extent1].[EmailAddress] AS [EmailAddress],
       [Extent1].[FirstName]    AS [FirstName],
       [Extent1].[LastName]     AS [LastName]
FROM   [dbo].[Users] AS [Extent1]
WHERE  ([Extent1].[Id] = 'b5604f88-3e18-42a5-a45e-c66cc2a632d3' /* @p__linq__0 */)
       AND ('b5604f88-3e18-42a5-a45e-c66cc2a632d3' /* @p__linq__0 */ IS NOT NULL)
Rolfrolfe answered 7/5, 2014 at 13:5 Comment(11)
You could also omit the ToList() call in your second example. Single() also enumerates over a query. Just keep the Where() clause. Jacquejacquelin
What happens when you run the same queries in Management Studio? Does the nested one execute slower or not? Where did you get those times from? On the client (as response time) or on the DB itself? Jacquejacquelin
.Single() needs the "top 2" results returned to determine whether there is a single match. One row in the result means success; two rows or zero rows mean that ".Single()" failed and should throw an error. In your 2nd example, the ".ToList()" causes the query to execute before ".Single()" is applied. The query could return any number of results, and the .Single() method will be applied to the List result. Vibratile
Are you trying the two queries out inside the same application? If so, is the slower of the two the very first query Entity Framework makes in your application? Entity Framework has a very significant overhead for the very first query performed per database per app domain. It re-uses the cached metadata it builds after the first query even if the object is disposed, but that very first query is going to be much, much slower (just by running the same query a 2nd time later in the program you will see a significant change in speed). Dusen
@Jaycee The clustered index is from the primary key, which is the Id. Rolfrolfe
@RobertKoritnik The numbers are from Entity Framework Profiler. As I understand it, they are taken client-side and also include the in-memory operations made by EF. Rolfrolfe
@Towa I am surprised a clustered index on a GUID is not degrading performance. Gowon
So did you try running these two on the database and checking their execution time? Because if there's no difference there, you should run the same thing using EF consecutively and see if the first call is slower but subsequent ones are faster. And if even that doesn't bring any particular results, I suggest you use the second query, as it's more concise (from the T-SQL perspective) and also produces better results. Jacquejacquelin
@Jaycee: It likely is, but it could be that his table is small, so it's not heavily paged, if at all... GUIDs make bad primary keys... Jacquejacquelin
@ScottChamberlain These queries are not the first queries run. There are several others which are run before. To prevent the cache build-up from affecting my numbers, I always run the queries several times. Rolfrolfe
@RobertKoritnik - I know this is old, but I don't agree with your comment "GUIDs make bad primary keys". Having the PK be a GUID is fine and provides many benefits, especially with disconnected systems, AND there is a way to get around the performance issue. Solution: don't make the PK the clustered index. Instead, create another identity column which increments by 1 (e.g. "ClusterId") and use that as the clustered index. The purpose of the ClusterId column is strictly sequential DB storage, and it is not used by anyone, while the GUID is the PK used for relationships. Wrong
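
Several of the comments above suggest running each query repeatedly so that first-call overhead does not distort the numbers. One way to do that from the client side is sketched below. The QueryTimer class, its method name, and the warmups and runs parameters are hypothetical and not part of Entity Framework. The helper only measures wall-clock time around whatever delegate it is given, so it cannot by itself say whether the time is spent in SQL Server or in EF's in-memory work.

```csharp
using System;
using System.Diagnostics;

public static class QueryTimer
{
    // Hypothetical helper: runs the action a few times without measuring
    // (to absorb one-off start-up costs), then times `runs` further calls
    // and returns the middle sample of the sorted timings in milliseconds.
    public static double MedianMilliseconds(Action action, int warmups = 3, int runs = 20)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        // Unmeasured warm-up calls.
        for (int i = 0; i < warmups; i++)
        {
            action();
        }

        // Measured calls.
        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            samples[i] = stopwatch.Elapsed.TotalMilliseconds;
        }

        // The middle sample is less sensitive to occasional spikes than the mean.
        Array.Sort(samples);
        return samples[runs / 2];
    }
}
```

Assuming the dbContext and userId from the question, the two variants could then be compared with QueryTimer.MedianMilliseconds(() => dbContext.Users.Single(u => u.Id == userId)) and QueryTimer.MedianMilliseconds(() => dbContext.Users.Where(u => u.Id == userId).ToList().Single()). Reusing one context across runs means the entity is already tracked after the first call, which may affect materialization time, so it can be worth repeating the measurement with a fresh context per call as well.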

What you actually want is dbContext.Users.Find(id). This will not even go to the database if it does not have to. See more details on MSDN.

Cape answered 8/5, 2014 at 1:15 Comment(1)
+1 for telling me about Find(). I didn't know this before. But in this special case it generates the same SQL as with Single and also has the same (bad) performance. Rolfrolfe

When you say

dbContext.Users.Single(u => u.Id == userId);

Users is of type DbSet, or it's a collection of users, so it first fetches the collection of users, while dbContext.Users.Where(u => u.Id == userId).ToList().Single(); contains a condition for loading it.

So if there were 100 users, the first query would fetch 100 users and then do the filtering, while the second one would fetch only 1.

Hope this helps.

Rus answered 7/5, 2014 at 15:32 Comment(2)
Anshul, the first query will only return the first two users from the database that match that userid. It doesn't fetch all 100 users, even if there are 100 users with that userid. The second will return all users that match that userid and then apply the Single method in memory. Buine
Why not ask your question here (but in a different thread) on Stack Overflow? :) Buine
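
The difference described in the comment above can be imitated with plain in-memory LINQ, without Entity Framework or a database. This is only a sketch: SingleSemantics, SingleViaTopTwo and SingleViaFullList are hypothetical names, and Take(2) stands in for the TOP (2) that appears in the generated SQL. It is not a claim about how EF implements Single internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleSemantics
{
    // Imitates the first query: ask for at most two matches, which is enough
    // to distinguish "exactly one" from "none" and from "more than one".
    public static T SingleViaTopTwo<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        List<T> firstTwo = source.Where(predicate).Take(2).ToList();
        if (firstTwo.Count == 0)
            throw new InvalidOperationException("No matching element.");
        if (firstTwo.Count > 1)
            throw new InvalidOperationException("More than one matching element.");
        return firstTwo[0];
    }

    // Imitates the second query: materialize every match first,
    // then let Single() check the in-memory list.
    public static T SingleViaFullList<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        List<T> allMatches = source.Where(predicate).ToList();
        return allMatches.Single(); // throws InvalidOperationException unless Count == 1
    }

    // Small demonstration; call SingleSemantics.Demo() from your own entry point.
    public static void Demo()
    {
        int[] ids = { 1, 2, 2, 3 };
        Console.WriteLine(SingleViaTopTwo(ids, id => id == 1));
        Console.WriteLine(SingleViaFullList(ids, id => id == 3));
    }
}
```

Both helpers return the same element when exactly one match exists and throw otherwise. They differ only in how many matching rows are pulled before the check: at most two in the first form, all of them in the second. For a lookup on a unique key that difference is at most one row.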
