Why SQL Server execution plan depends on comparison order
Asked Answered
C

2

6

I was optimising a query on SQL Server and ran into something I was not expecting. There is a table tblEvent in the database, among other columns it has IntegrationEventStateId and ModifiedDateUtc. There is an index by these columns:

create index IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc
on dbo.tblEvent (
    IntegrationEventStateId,
    ModifiedDateUtc
)

When I execute the following statement:

select *
from dbo.tblEvent e
where
    e.IntegrationEventStateId = 1
    or e.IntegrationEventStateId = 2
    or e.IntegrationEventStateId = 5
    or (e.IntegrationEventStateId = 4 and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))

I get this execution plan (note the index does NOT get used):

enter image description here

But when I execute this statement:

select *
from dbo.tblEvent e
where
    1 = e.IntegrationEventStateId
    or 2 = e.IntegrationEventStateId
    or 5 = e.IntegrationEventStateId
    or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))

I get this execution plan (note the index DOES get used):

enter image description here

The only difference between the two statements is the order of comparisons in the where clause. Can anyone please explain why I get different execution plans?

Update 1 - a full repro script is below

CREATE TABLE dbo.tblEvent
(
   EventId                 INT IDENTITY PRIMARY KEY,
   IntegrationEventStateId INT,
   ModifiedDateUtc         DATETIME,
   OtherCol                CHAR(1),
   index IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc(IntegrationEventStateId, ModifiedDateUtc)
);

INSERT INTO dbo.tblEvent
SELECT TOP 356525 3,
                  DATEADD(SECOND, ROW_NUMBER() OVER (ORDER BY @@SPID)%63424, GETUTCDATE()),
                  'A'
FROM   sys.all_objects o1,
       sys.all_objects o2;

UPDATE STATISTICS dbo.tblEvent WITH FULLSCAN


select *
from dbo.tblEvent e 
where
    e.IntegrationEventStateId = 1
    or e.IntegrationEventStateId = 2
    or e.IntegrationEventStateId = 5
    or (e.IntegrationEventStateId = 4 and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))


select *
from dbo.tblEvent e
where
    1 = e.IntegrationEventStateId
    or 2 = e.IntegrationEventStateId
    or 5 = e.IntegrationEventStateId
    or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))

Update 2 - DDL of the original table

CREATE TABLE [dbo].[tblEvent]
(
[EventId] [int] NOT NULL IDENTITY(1, 1),
[EventTypeId] [int] NOT NULL,
[ScorecardId] [int] NULL,
[ScorecardAreaId] [int] NULL,
[AreaId] [int] NULL,
[ScorecardTopicId] [int] NULL,
[TopicId] [int] NULL,
[ScorecardRequirementId] [int] NULL,
[RequirementId] [int] NULL,
[DocumentId] [int] NULL,
[FileId] [int] NULL,
[TopicTitle] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ScorecardTopicStatus] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[RequirementText] [nvarchar] (500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ScorecardRequirementStatus] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[DocumentName] [nvarchar] (260) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CreatedByUserId] [int] NOT NULL,
[CreatedByUserSessionId] [int] NOT NULL,
[CreatedDateUtc] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Create__0737E4A2] DEFAULT (sysutcdatetime()),
[CreatedDateLocal] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Create__082C08DB] DEFAULT (sysdatetime()),
[ModifiedByUserId] [int] NOT NULL,
[ModifiedByUserSessionId] [int] NOT NULL,
[ModifiedDateUtc] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Modifi__09202D14] DEFAULT (sysutcdatetime()),
[ModifiedDateLocal] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Modifi__0A14514D] DEFAULT (sysdatetime()),
[IsDeleted] [bit] NOT NULL,
[RowVersion] [timestamp] NOT NULL,
[ScorecardRequirementPriority] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[AffectedUserId] [int] NULL,
[UserId] [int] NULL,
[CorrelationId] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventStateId] [int] NULL,
[IntegrationEventId] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventContent] [nvarchar] (max) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventType] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventTryCount] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [PK_dbo.tblEvent] PRIMARY KEY CLUSTERED ([EventId]) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc] ON [dbo].[tblEvent] ([IntegrationEventStateId], [ModifiedDateUtc]) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [FK_dbo.tblEvent_dbo.tblEventType_EventTypeId] FOREIGN KEY ([EventTypeId]) REFERENCES [dbo].[tblEventType] ([EventTypeId])
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [FK_dbo.tblEvent_dbo.tblIntegrationEventState_IntegrationEventStateId] FOREIGN KEY ([IntegrationEventStateId]) REFERENCES [dbo].[tblIntegrationEventState] ([IntegrationEventStateId])
GO
Coccidioidomycosis answered 28/3, 2020 at 0:21 Comment(20)
Which version of SQL Server and what database compatibility level? Different execution planners get used for different versions (and compatibility levels), so such a behaviour difference would be very interesting. Is it repeatable, or did the planner just take a different strategy because statistics on the index changed giving it better cardinality for cost assessments?Yoon
Hi @AlwaysLearning. SQL Server version: Microsoft SQL Server 2016 (SP1-CU5) (KB4040714) - 13.0.4451.0 (X64) Sep 5 2017 16:12:34 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor). DB compatibility level: SQL Server 2016 (130). The behaviour is repeatable, I ran the queries multiple times and consistently got the above execution plans.Coccidioidomycosis
Can you supply the XML for both execution plans? e.g. to brentozar.com/pastetheplanLogger
@MartinSmith Done, here are the links: brentozar.com/pastetheplan/?id=ry40ir3I8, brentozar.com/pastetheplan/?id=SJcShB2LICoccidioidomycosis
It may have just cached a bad plan and when you changed the syntax it decided on a new (better) plan. What happens if you add where 1=1 to the first plan?Washer
That's just weird... the second plan includes StatementOptmEarlyAbortReason="GoodEnoughPlanFound", the first does not. Still looking.Yoon
Actually, Nick.McDermaid may have a point... Do you still get the same query plans if you execute DBCC FREEPROCCACHE beforehand? (I'm assuming this isn't a production server, by the way.)Yoon
@Nick.McDermaid, @Yoon I have run the original queries, as well as queries with added and 1=1, and queries with added or 1=0, executing DBCC FREEPROCCACHE every time before I execute the query. No change in exec plans.Coccidioidomycosis
Intrguing. This may just add to the mystery but are these estimated or actual? what do the actual execution plans look like for those?Washer
@Nick.McDermaid Those are actual plans.Coccidioidomycosis
For some reason in the first plan it estimates 356,525 rows are going to meet the predicate - i.e. 100% of the rows in the table. This is probably why it doesn't do a plan with seek + lookups. Unclear why it would estimate that when the estimates for the individual seeks add up to 4 in the second planLogger
What do the statistics on IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc look like? I can reproduce the bad estimates if I insert 356,525 rows - all of which have 3 for IntegrationEventStateId - but this applies to both queriesLogger
what is the datatype of ModifiedDateUtc?Cyano
@Cyano ModifiedDateUtc definition is: [ModifiedDateUtc] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Modifi__09202D14] DEFAULT (sysutcdatetime())Coccidioidomycosis
@MartinSmith I've added index statistics to the question. I also re-ran both queries with dbcc freeproccache, to confirm that I still get the same exec plans with the statistics I'm posting.Coccidioidomycosis
@Cyano FYI, I just tried to change the date in the queries to a variable of type datetime2(4), it made no difference.Coccidioidomycosis
FYI all, I posted another question, which explains how this question came about: #60913235. You might find it interesting.Coccidioidomycosis
Can you post the full table and index DDL?Armchair
@DavidBrowne-Microsoft - I've added a reproLogger
@DavidBrowne-Microsoft I've added DDL to the question.Coccidioidomycosis
P
5

There are a number of issues here, but the most important is cardinality estimation (CE).

The newer ("default") CE model has a hard time with the predicates when it tries to compute selectivity against the histogram with no matching steps.

For example, the initial cardinality estimate returns a selectivity of 1 for:

select *
from dbo.tblEvent e
where
    1 = e.IntegrationEventStateId
    or 2 = e.IntegrationEventStateId
    or 5 = e.IntegrationEventStateId
    or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))

as shown using trace flags 3604 and 2363:

Begin selectivity computation

Input tree:

  LogOp_Select
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)
      ScaOp_Logical x_lopOr
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=1)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=2)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)
          ScaOp_Logical x_lopAnd
              ScaOp_Comp x_cmpGe
                  ScaOp_Identifier QCOL: [e].ModifiedDateUtc
                  ScaOp_Identifier COL: ConstExpr1001 
              ScaOp_Comp x_cmpEq
                  ScaOp_Identifier QCOL: [e].IntegrationEventStateId
                  ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=4)

Plan for computation:

  CSelCalcCombineFilters_ExponentialBackoff (OR)
      CSelCalcCombineFilters_ExponentialBackoff (AND)
          CSelCalcColumnInInterval
              Column: QCOL: [e].ModifiedDateUtc
          CSelCalcColumnInInterval
              Column: QCOL: [e].IntegrationEventStateId
      CSelCalcColumnInInterval
          Column: QCOL: [e].IntegrationEventStateId

Loaded histogram for column QCOL: [e].ModifiedDateUtc from stats with id 3
Loaded histogram for column QCOL: [e].IntegrationEventStateId from stats with id 2

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=2, CARD=356525)
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)

End selectivity computation

When cost-based optimization starts, and the input tree is in a slightly different form, the CE is asked to compute the selectivity of the simpler predicates:

Begin selectivity computation

Input tree:

  LogOp_Select
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)
      ScaOp_Logical x_lopOr
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=1)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=2)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)

Plan for computation:

  CSelCalcColumnInInterval
      Column: QCOL: [e].IntegrationEventStateId

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=3, CARD=356525)
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)

End selectivity computation

This is the equivalent of:

SELECT *
FROM dbo.tblEvent AS TE 
WHERE TE.IntegrationEventStateId IN (1, 2, 5);

In both cases, the CE assesses that 100% of the rows will match, despite there being no histogram steps for the values 1, 2, or 5 (the sample data has values of 3 only). It is tempting to blame the CSelCalcColumnInInterval calculator for this, as it seems to treat {1, 2, 5} as a single interval {1:5}.

As is often the case, the "legacy" CE does a better (more detailed) job here, so you should find the following hint will produce much better plans:

OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

With the repro data, this produces a single seek and key lookup as one would hope.

"legacy" CE plan

Note that the seek performs four seeking operations, one for each disjoint predicate.

[1] Seek Keys[1]: Prefix: IntegrationEventStateId = 1
[2] Seek Keys[1]: Prefix: IntegrationEventStateId = 2
[3] Seek Keys[1]: Prefix: IntegrationEventStateId = 4, Start: ModifiedDateUtc >= dateadd(minute,(-5),getutcdate())
[4] Seek Keys[1]: Prefix: IntegrationEventStateId = 5

The new CE is designed to be more predictable, and easier to maintain/extend than the original CE. The "legacy" one had bits bolted on to it and refinements made over a long period of time. That complexity has benefits and pitfalls. Regressions and lower-quality estimates are somewhat expected with the newer CE. This ought to improve over time, but we are not there yet. I would view the behaviour shown here as a limitation of the calculator. Perhaps they will fix it.

See Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator.


The question of why the plan shape depends on textual representation is more of a side issue. The compilation process does contain logic (e.g. rule SelPredNorm) to rewrite predicates into a normalized form, and both repro queries are successfully rewritten to the same tree. This is done for various internal purposes, including index and computed column matching, and to make logical simplification easier to process.

Unfortunately, the rewritten form is only used prior to cost-based optimization. The input to the cost-based optimizer retains the differences in textual order present in the original query. I believe this is intentional, and done to prevent unexpected plan changes. People sometimes write queries in slightly different, and unusual, ways to achieve a particular plan shape. If the optimizer suddenly started defeating those attempts as logically redundant, people would be upset. This is arguably less of an issue with things like query store, and more efficient plan forcing, but these are relatively recent innovations.

In other words, the plans are different because people have relied on different text producing different plans in the past, and changing that now would be too disruptive.

Proponent answered 30/3, 2020 at 10:5 Comment(0)
A
1

That is interesting and unusual, and I don't know why the QO recognizes that the seeking the index 4 times is going to be cheaper than the scan in only one of those queries.

Sometimes when you get strange plans the best solution is change the indexes and queries so that good plans are easier to get.

perhaps something like

CREATE TABLE dbo.tblEvent
(
   EventId                 INT IDENTITY PRIMARY KEY,
   IntegrationEventStateId INT,
   ModifiedDateUtc         DATETIME,
   OtherCol                CHAR(1),
   index IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc(IntegrationEventStateId, ModifiedDateUtc) 
     include (OtherCol) 
     where IntegrationEventStateId in (1,2,4,5)
);

and then

select *
from dbo.tblEvent e 
where
    e.IntegrationEventStateId in (1,2,4,5)
    and (e.IntegrationEventStateId <> 4 or e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))
Armchair answered 29/3, 2020 at 16:56 Comment(1)
Thanks @DavidBrowne-Microsoft, to remove "flakiness" I rewrote the query from multiple ors to multiple union alls.Coccidioidomycosis

© 2022 - 2024 — McMap. All rights reserved.