SQL Query execution shortcut OR logic?
Asked Answered
M

6

7

I have three tables:

SmallTable
   (id int, flag1 bit, flag2 bit)
JoinTable
   (SmallTableID int, BigTableID int)
BigTable
   (id int, text1 nvarchar(100), otherstuff...)

SmallTable has, at most, a few dozen records. BigTable has a few million, and is actually a view that UNIONS a table in this database with a table in another database on the same server.

Here's the join logic:

SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE
    (s.flag1=1 OR b.text1 NOT LIKE 'pattern1%')
    AND (s.flag2=1 OR b.text1 <> 'value1')

Average joined size is a few thousand results. Everything shown is indexed.

For most SmallTable records, flag1 and flag2 are set to 1, so there's really no need to even access the index on BigTable.text1, but SQL Server does anyway, leading to a costly Indexed Scan and Nested Loop.

Is there a better way to hint to SQL Server that, if flag1 and flag2 are both set to 1, it shouldn't even bother looking at text1?

Actually, if I can avoid the join to BigTable completely in these cases (JoinTable is managed, so this wouldn't create an issue), that would make this key query even faster.

Munroe answered 25/1, 2010 at 21:5 Comment(2)
+1 for interesting question. Hoping to learn more from this myself!Ekg
You mentioned an index scan on BigTable which is a view. Is it an indexed view, or the index scan is performed on the underlying tables? Could you please post the query plan here?Condillac
H
5

SQL Boolean evaluation does NOT guarantee operator short-circuit. See On SQL Server boolean operator short-circuit for a clear example showing how assuming operator short circuit can lead to correctness issues and run-time errors.

On the other hand the very example in my link shows what does work for SQL Server: providing an access path that SQL can use. So, as with all SQL performance problems and questions, the real problem is not in the way the SQL text is expressed, but in the design of your storage. Ie. what indexes has the query optimizer at its disposal to satisfy your query?

Hecklau answered 25/1, 2010 at 21:24 Comment(3)
I agree--this isn't a short-cutting issue per se, but a problem where the query engine sends off work to check the value of text1 even with both flags are set to 1. It's an indexed operation, but unnecessary if all selected records in SmallTable have those flags set. As I commented elsewhere, I think it's a problem with the query optimizer not being "smart enough" to avoid the branch of work on BigTable.text1 where all of the records in SmallTable don't require the text comparisons.Munroe
Unfortunately there are no procedural/conditional query tree operators. In other words, there is no operators that says 'if condition is true go down this path else this other path'. The query optimizare has to create a plan that satisfies all possible conditions, even those that have a very low probability. The query plan can do lots of tricks when ther are deterministic conditions, eg. two tables in a join with a trusted foreign relation may completely eliminate one table from the plan. This is where all T-SQL tricks come into picture, like use UNION instead of OR.Hecklau
That's a very good comment. I was struggling to word my understanding but this IMHO is a great explanation.Ekg
E
1

I don't believe SQL Server will short-circuit conditions like that unfortunately.

SO I'd suggest doing 2 queries and UNION them together. First query with s.flag1=1 and s.flag2=1 WHERE conditions, and the second query doing the join on to BigTable with the s.flag1<>1 a s.flag2<>1 conditions.

This article on the matter is worth a read, and includes the bottom line:

...SQL Server does not do short-circuiting like it is done in other programming languages and there's nothing you can do to force it to.

Update:
This article is also an interesting read and contains some good links on this topic, including a technet chat with the development manager for the SQL Server Query Processor team which briefly mentions that the optimizer does allow short-circuit evaluation. The overall impression I get from various articles is "yes, the optimizer can spot the opportunity to short circuit but you shouldn't rely on it and you can't force it". Hence, I think the UNION approach may be your best bet. If it's not coming up with a plan that takes advantage of an opportunity to short cut, that would be down to the cost-based optimizer thinking it's found a reasonable plan that does not do it (this would be down to indexes, statistics etc).

Ekg answered 25/1, 2010 at 21:17 Comment(5)
Good comment, but while SQL Server doesn't shortcut expression evaluation, it does do query optimization. So, if my query is limited to a result set of SmallTable where both flags are 1, it should be smart enough to skip the check of BigTable.text1, but that would require changing the execution plan based on the data. I think that's the core problem.Munroe
Yes, I see what you're saying but I still see it as shortcutting - the optimizer would need to be able to see the opportunity for short-circuiting. I've done some tests myself in the past on this kind of thing, and that's what I've seen (no optimisation in this manner).Ekg
The problem is that for an or it can evaluate in any order and it is doing the right hand side first.Hematology
@Hogan, I think the actual issue is that SQL Server always does both sides and merges them, not realizing that the left side may always return a "1" and thus make the right side check superfluous. It's a limitation of having a single, non-procedural execution plan for a query rather than a procedural check that could lop of parts of the query execution path at runtime. Even UNION performs all sides of the UNION, so I don't think it can escape here either. A stored procedure with IF...ELSE...END logic may be the only answer.Munroe
Just a word of caution here since this isn't the first time I see the Nigel Ellis chat quoted: the chat transcript was published to TechNet on November 19, 2001. The QO had gone through significant changes in 2005 release, and more changes in 2008 as well. As a general rule, information from Craig's blog is significantly more up to date blogs.msdn.com/craigfr.Hecklau
H
0

No idea if this will be faster without test data... but it sounds like it might

SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE
    (s.flag1=1) AND (s.flag2=1)
 UNION ALL
 SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE
    (s.flag1=0 AND b.text1 NOT LIKE 'pattern1%')
    AND (s.flag2=0 AND b.text1 <> 'value1')

Please let me know what happens

Also, you might be able to speed this up by just returning just a unique id for this query and then using the result of that to get all the rest of the data.

edit

something like this?

SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE
    (s.flag1=1) AND (s.flag2=1)
 UNION ALL
 SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE EXISTS
    (SELECT 1 from BigTable b
     WHERE   
    (s.flag1=0 AND b.text1 NOT LIKE 'pattern1%')
    AND (s.flag2=0 AND b.text1 <> 'value1')
)
Hematology answered 25/1, 2010 at 21:15 Comment(2)
Tried this, it was actually much slower when either flag was set to 1.Munroe
I think there might we a way to do this with extra joins and a force order, but my brain is to fuzzy to think of it right now.Hematology
A
0

It's not elegant, but it should work...

SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE
    (s.flag1 = 1 and s.flag2 = 1) OR 
    (
       (s.flag1=1 OR b.text1 NOT LIKE 'pattern1%')
       AND (s.flag2=1 OR b.text1 <> 'value1')
    )
Attain answered 25/1, 2010 at 21:16 Comment(1)
Thanks... tried this, but the execution was identical from what I could tell.Munroe
S
0

SQL Server usually grabs the subquery hint (though it's free to discard it):

SELECT      * 
FROM        (
            SELECT * FROM SmallTable where flag1 <> 1 or flag2 <> 1
            ) s
INNER JOIN  JoinTable j ON j.SmallTableID = s.ID
...
Spaceman answered 25/1, 2010 at 21:24 Comment(0)
J
0

Hope this works - careful of shortcut logic in case statements around aggregates but...

SELECT * FROM
    SmallTable s
    INNER JOIN JoinTable j ON j.SmallTableID = s.ID
    INNER JOIN BigTable b ON b.ID = j.BigTableID
WHERE 1=case when (s.flag1 = 1 and s.flag2 = 1) then 1
when (
       (s.flag1=1 OR b.text1 NOT LIKE 'pattern1%')
       AND (s.flag2=1 OR b.text1 <> 'value1')
    ) then 1
else 0 end
Jacquelynnjacquenetta answered 14/6, 2019 at 12:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.