Checking variable for NULL kills performance
Asked Answered
B

2

8

I have following queries:

DECLARE @application_number CHAR(8)= '37832904';
SELECT
    la.LEASE_NUMBER AS lease_number,
    la.[LEASE_APPLICATION] AS application_number,
    tnu.[FOLLOWUP_CODE] AS note_type_code -- catch codes not in codes table
FROM [dbo].[lease_applications] la
LEFT JOIN [dbo].tickler_notes_uniq tnu ON tnu.[ACCOUNT_NUMBER] = la.[ACCOUNT_NUMBER]
WHERE la.LEASE_APPLICATION = @application_number
      OR @application_number IS NULL;



SELECT
    la.LEASE_NUMBER AS lease_number,
    la.[LEASE_APPLICATION] AS application_number,
    tnu.[FOLLOWUP_CODE] AS note_type_code -- catch codes not in codes table
FROM [dbo].[lease_applications] la
LEFT JOIN [dbo].tickler_notes_uniq tnu ON tnu.[ACCOUNT_NUMBER] = la.[ACCOUNT_NUMBER]
WHERE la.LEASE_APPLICATION = @application_number; 

The only difference between these 2 queries is that I've added checking for the variable if it is NULL or not.

The execution plans of these queries are: enter image description here

You can find graphical plan here

So the question is. Why the plans are so different?

UPDATE:

The actual execution plan of the first query can be found here

OPTION(RECOMPILE) changed the actual execution plan to the good one. However the downside of that is that my main goal was to create the TVF with these params and then everybody who uses that function is supposed to provide that option.

It is also worth to mention that my main goal is to create TVF with 2 params. Each of it might be null and might be not but at least 1 of them is supposed to be NOT NULL. These params are more or less equal, they are just different keys in the 2 tables that would give the same result anyway (the same number of rows and so on). That's why I wanted to do something like

WHERE (col1 = @param1 OR @param1 IS NULL) AND (col2 = @param2 OR @param2 IS NULL) AND (@param1 IS NOT NULL or @param2 IS NOT NULL)

So, basically I am not interested in ALL records at all

Beldam answered 6/9, 2017 at 7:29 Comment(12)
Because the first takes into account the possibility that @application_number is NULL, meaning the condition la.LEASE_APPLICATION = @application_number may not filter any rows at all. The second one knows it does.Geochronology
The param is provided and the value is set. How can it be null?Beldam
It isn't null in this case, but the optimizer must produce a plan that is valid for all possible cases. Try adding a WITH (RECOMPILE), which should at least skew the estimates in the right direction (but may not change the actual plan). OPTION (OPTIMIZE FOR (@application_number = '37832904')) is another possibiity. Again, the actual plan must still take the possibility into account that the value is null at runtime, when the cached plan is reused. OR @x IS NULL conditions are troublesome, which is why people often use IF and split the statements.Geochronology
OPTION (RECOMPILE) not WITH (RECOMPILE). Then SQL Server will recompile the statement on every invocation and not cache the plan - so it can optimize for the specific value. But you pay the compilation cost each time.Boxwood
@MartinSmith: right you are, I got confused with the WITH RECOMPILE option for stored procedures.Geochronology
brentozar.com/pastetheplanGustative
RECOMPILE does not change the plan. I understand your answers and they are logical. However it is very strange for me that this very common situation (adding OR @param IS NULL statement) can't be handled properly. So, does it mean, that I can not have function that would function normally with the cases when the param is provided and when it is not?Beldam
@DmitrijKultasev could you paste an actual execution plan of the "bad" query instead of the estimated one?Zhdanov
And you won't see the effect of OPTION (RECOMPILE) in the estimated plan. You will need to actually execute the queries and get the actual plan.Boxwood
I updated the questionBeldam
@DmitrijKultasev Simply don't use such "kitchen sink" queries. They are a well known bug. The only proper way to handle them is to avoid them. That only difference is huge. The same query plan has to be used either to match some records or return everything. The query optimizer can't predict the future. It will generate a plan based on the parameters your application provides the first time it calls the query.Jared
@DmitrijKultasev besides, such queries aren't even needed. ORMs can generate the proper conditions without resorting to kitchen sinks. Create a view to hide the join, field details and map/query the view instead of individual tablesJared
Z
7

You have two different plans for two different queries. It makes sense that when you have an equality condition on the WHERE clause(la.LEASE_APPLICATION = @application_number)(and having indexes in place) you get an index seek: working as expected!

On the other hand, when you write both conditions into one WHERE clause (la.LEASE_APPLICATION = @application_number OR @application_number IS NULL) the query optimizer has chosen to do a scan.

Even though the parameter value has been supplied and it is not null, the plan that is being used is the cached one and it can not know at compile time the actual value of your parameter.

This is the case if you have a stored procedure and you are calling it with parameters. This is not the case when executing a simple query using a variable. As @sepupic has stated, variable values do not get sniffed.

The plan is generated to handle both cases: when you have a value for your parameter as well as when you have none.

One option to fix your problem would be using OPTION(RECOMPILE) as it has been stated already in the comments. Another option would be to have your queries separated(for ex. having two different stored procedures, called by a third "wrapper" procedure), so that they get optimized accordingly, each one on it's own.

I would suggest you to take a look at this article by Kimberly L. Tripp: Building High Performance Stored Procedures and this other one by Aaron Bertrand: An Updated "Kitchen Sink" Example. I think these are the best articles explaining these kind of scenarios.

Both articles explain this situation, possible problems with it and possible solutions as well such as option(recompile), dynamic sql or having separated stored procedures.

Good luck!

Zhdanov answered 6/9, 2017 at 8:50 Comment(9)
The problem here is that I want to create TVF. Please read the UPDATEd section in my question. I am not interesting in scans at all as I will always use the key to get the record (either from param1 or param2)Beldam
@DmitrijKultasev you can specify option(recompile) at statement level. You could have it inside your function already defined and whoever is using your function does not need to write option(recompile).Zhdanov
>>>Even though the parameter value has been supplied and it is not null, the plan that is being used is the cached one and it can not know at compile time the actual value of your parameter.<<< That is not true. These queries do not use parameters (parameter is what you are passing in sp_executesql or in procedure) but use a variable. That is NEVER sniffed unless RECOMPILE option is used at the statement levelThay
@RigertaDemiri how to add with recompile to TVF definition (not stored procedure)?Beldam
@Thay that is true, you are right! Using a local variable actually is one of the possible workarounds for parameter sniffing but in this case I thought this was the piece of code inside a stored procedure. I will edit my answer to clarify.Zhdanov
@Zhdanov Demiri, if you look at the actual plan you'll see there is only runtime value for @application_number, if it was the plan created and cached previously you'd see there 'Compiled value null' but there isn't. So the plan was created without any idea of @application_numberThay
Dmitrij, apparently for TVFs you need to specify that option when calling. I was not aware of this. @sepupic, I have edited my answer and I think the information provided in it might be useful in the TVF scenario, so I will let it stay there. But I will follow up this post to learn from other answers! :)Zhdanov
@Zhdanov Demiri of course your info is useful for modules because in that case the plan is built based on the first sniffed parameter value, and without recompiling that cached plan will be used even for other parametersThay
@DmitrijKultasev a TVF isn't a stored procedure. It's replaced by its actual text when called. Which means you get no benefit from creating a kitchen sink TVF. Just don't do it this way. Write a view to hide the query's complexity and use the appropriate filters each time. ORMs like EF, Dapper make this trivialJared
T
3

Your queries do not use parameters, they use a variable. The variable is not sniffed at the moment the batch is compiled (compilation = making a plan) because the batch is seen as one whole thing. So server has no idea if the variable is null or is not null. And it must make a plan that will be suitable in both cases.

The first query can filter no rows at all so the scan is selected.

The second query does filter, but the value is unknown, so if you use SQL server 2014 and the fintered column is not unique, the estimation is C^3/4 (C= table cardinality)

The situation can be different if you use RECOMPILE query option. When you add it to your query, it's recompiled AFTER the assignment of table variable is done. In this case the variable value is known, and you'll get another plan. It will be a plan based on column statistics for a known value of your filter

Thay answered 6/9, 2017 at 8:57 Comment(2)
Whether the value of the variable is sniffed or not is irrelevant here. The issue is that SQL Server has to cache a plan that is valid for any parameter/variable value passed. If the OP was to use parameters (that are sniffed) they would still get the same issue that the addition of OR @application_number IS NULL causes a worse plan.Boxwood
@Martin Smith you are rigth. I just tried to explain what changes when the RECOMPILE option is used. The first query in the form it's written will lead to the same plan in case of parameter/variable just because it's not known if it will be null or not, so the plan must be valid for null tooThay

© 2022 - 2024 — McMap. All rights reserved.