"SELECT TOP 1 1" VS "IF EXISTS(SELECT 1"
Asked Answered
J

5

51

I have some .NET code that checks for the existence of a SQL record at a moderately-high interval. I am looking to make this check as "cheap" as possible.

I'm wondering the characteristics of two queries:

IF EXISTS(SELECT 1
          FROM   BigTable
          WHERE  SomeColumn = 200)
  SELECT 1 AS FOUND
ELSE
  SELECT 0 AS FOUND

VS

SELECT TOP 1 1
FROM   BigTable
WHERE  SomeColumn = 200 

They both produce similar execution plans. But the SELECT TOP 1 1 seems to execute faster: Less query to parse and when record is not found, it sends less down the pipe. I'm also assuming it runs faster at the client because I just need to check the Record Count, rather than marshaling the return value of IF EXISTS.

Most of the performance benefits are negligible. But if both consistently return the same result, then why not choose the slightly faster method?

Is "SELECT TOP 1 1" THEE best way to check for an existence of a record in .NET?

(We use .NET 3.5, and I am trying to avoid LINQ because it is not used elsewhere in the application. We also have some legacy VB6 apps that we are migrating/rewriting, so they may need to execute this as well.)

EDIT: Just a little more detail on design. This record is a "header". There is another table that has child records that will be read/parsed when this header is found. The lack of a record is a good thing: there is no work to do.

EDIT2: The lack of a record that meets the condition will occur more often. They come in sporadic waves.

Joviality answered 13/6, 2012 at 6:0 Comment(6)
It doesn't work when where there are no rows that meet conditions. Try SELECT TOP 1 1 FROM BigTable where 1 = 0 Peh
That's the point. I'm checking for the existence of a row that DOES meet the condition.Joviality
After "EDIT": Why are you checking existence of record instead of just trying to fetch single record matching your criteria? If fetch (select) returns no records, you have no work to do and it's not much more expensive then only checking existence of such record. If it does return record, you have your header data loaded and just saved one database query.Philippe
That's a great question. Because I don't use the header data. I get all the child records based on a flag/state, not an ID. It's a third-party database I'm reading from.Joviality
thanks for the question, I browse a lot of sites just for understand the existing code select top 1 1, really, the readability is very important for maintenance and other people that reads your code...Weisler
NB: There's a good write up on this here: sqlservercentral.com/articles/T-SQL/154149Subphylum
B
43

I'd recommend IF EXISTS(SELECT * ...), unless this is actually causing a performance issue. It expresses the intent of the query in a much better understood fashion than alternatives.

I'd avoid COUNT(*) (as in the current answers) unless you actually need the count of rows from the table.

If you want the "efficiency" of checking the rowcount from the result, I'd probably go for:

select 1 where exists(select * from BigTable where SomeColumn=200)

Which produces the same result set as your second query (either 0 or 1 row)

Broadside answered 13/6, 2012 at 6:14 Comment(6)
Absolutely agree. It sounds obvious, but making it clear to the optimizer what we are trying to do (check whether something exists) is usually the first step in getting sane performance.Coarsegrained
This looks great! But "SELECT TOP 1 1" still is slightly faster on the actual execution plan. Maybe if this was in a procedure?Joviality
@EricSwanson - that may come down to all kinds of minutiae in the setup of your server (in terms of hardware, software and usage, how big the actual table is, what indexes are available, etc). I just ran a few tests against a moderately sized table and could not split the difference between them.Broadside
I'm just wondering why this is faster... less query text sent to the server? (which is why I am suggesting a stored procedure with a short name)Joviality
@EricSwanson - both texts are short enough that the should only occupy a single packet sent over the network, so the network overheads should swamp the physical size of the queries.Broadside
Alright, thanks for doing the test! I think I'm just splitting hairs now for minor improvements... I rather use your query because it makes more sense for what it does when you read it.Joviality
P
17

This is what you want instead of the IF statement

  SELECT ISNULL(
     (SELECT TOP 1 1 FROM BigTable where SomeCol = 200), 0);
Peh answered 13/6, 2012 at 6:12 Comment(2)
I find this the most elegant solutionMarxismleninism
This does not convey intent that easily or elegantly as if exists.Londonderry
S
8

Makes no difference at all, exists will not even evaluate the select portion of your statement. So, use the one you like.

declare @test table (name varchar(20))

-- comment out inserts for testing.
insert into @test (name) values ('bob the builder')
insert into @test (name) values ('bob the builder')

-- for testing, put 1/0 here. You'll find that divide by zero error.
select 1 from @test

-- notice that you don't receive a divide by zero error.
-- the bit in the select portion is never executed, ever.
if not exists (select 1/0 from @test) select 'Yay!'
if     exists (select 1/0 from @test) select 'Boo!'

In fact you can use these interchangeably:

... select * ... 
... select 1 ... 
... select top 1 * ... 
... select top 1 1 ... 
... select 'John Jacob Jingleheimer Schmidt' ... 
Sharie answered 6/2, 2019 at 20:21 Comment(0)
T
2

I would definitely go for the 2nd option:

SELECT TOP 1 1
FROM   BigTable
WHERE  SomeColumn = 200 

The execution plan is simpler and efficient even when I/O and CPU numbers are mostly the same.

Thatcher answered 30/9, 2013 at 21:17 Comment(0)
A
-16

Not sure about SQL Server but in other databases the standard method is:

SELECT COUNT(*) FROM BigTable where SomeCol = 200;

Actual efficiency depends on your indexes etc.

Asteria answered 13/6, 2012 at 6:6 Comment(3)
If you only care about existence/non-existence of rows, I'd explicitly warn against using COUNT(*), since it has to process the entire table to compute the result (whereas TOP 1 or EXISTS can stop after a single matching row has been found)Broadside
Agree with @Broadside - you should never use a "count" if you only care about "exists" - that causes extra workCoarsegrained
Count is bad, count(*) is even worse.. Even if you have no choice but to use count, at least use count(1).Messer

© 2022 - 2024 — McMap. All rights reserved.