"<>" vs "NOT IN"
Asked Answered
P

10

25

I was debugging a stored procedure the other day and found some logic something like this:

SELECT something
FROM someTable
WHERE idcode <> (SELECT ids FROM tmpIdTable)

This returned nothing. I thought it looked a little odd with the "<>" so I changed it to "NOT IN" and then everything worked fine. I was wondering why this is? This is a pretty old proc and I am not really sure how long the issue has been around, but we recently switched from SQL Server 2005 to SQL Server 2008 when this was discovered. What is the real difference between "<>" and "NOT IN" and has the behavior changed between Server2005 and 2008?

Pelargonium answered 13/5, 2009 at 14:38 Comment(0)
P
26
SELECT something
FROM someTable
WHERE idcode NOT IN (SELECT ids FROM tmpIdTable)

checks against any value in the list.

However, the NOT IN is not NULL-tolerant. If the sub-query returned a set of values that contained NULL, no records would be returned at all. (This is because internally the NOT IN is optimized to idcode <> 'foo' AND idcode <> 'bar' AND idcode <> NULL etc., which will always fail because any comparison to NULL yields UNKNOWN, preventing the whole expression from ever becoming TRUE.)

A nicer, NULL-tolerant variant would be this:

SELECT something
FROM someTable
WHERE NOT EXISTS (SELECT ids FROM tmpIdTable WHERE ids = someTable.idcode)

EDIT: I initially assumed that this:

SELECT something
FROM someTable
WHERE idcode <> (SELECT ids FROM tmpIdTable)

would check against the first value only. It turns out that this assumption is wrong at least for SQL Server, where it actually triggers his error:

Msg 512, Level 16, State 1, Line 1
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Pogrom answered 13/5, 2009 at 14:43 Comment(1)
WHERE idcode NOT IN (...) is equivalent to WHERE idcode <> ALL (...)Kong
R
14

try this, may run faster because of index usage:

SELECT something
FROM someTable
    LEFT OUTER JOIN tmpIdTable ON idcode=ids
WHERE ids IS NULL
Ruwenzori answered 13/5, 2009 at 14:45 Comment(1)
SQL Server normally (I haven't seen a case when it hasn't) produces identical QEP's for a join or a sub-query, and therefore identical performance.Tolly
O
10

<> is a "singular" NOT operation; NOT IN is a set operation, so it makes sense that the former wouldn't work. I have no idea whether or not it may have done so under a previous version of SQL Server, however.

Orgel answered 13/5, 2009 at 14:40 Comment(0)
N
5

This code is valid if and only if there are no rows or a single row returned from tmpIdTable:

SELECT something
FROM someTable
WHERE idcode <> (SELECT ids FROM tmpIdTable)

If multiple rows are returned, you will get an error like:

Msg 512, Level 16, State 1, Line 1 Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

This is the same error you get with nested scalar unexpectedly produces multiple rows like:

SELECT *, (SELECT blah FROM t1 WHERE etc.) FROM t2

Nothing has changed WRT this in SQL Server in a decade, so I expect assumptions about the nested query in the original code have been broken.

If no rows are returned, the result will be empty since <> NULL is never true (assume ANSI NULLs).

This code is valid for any number of rows:

SELECT something
FROM someTable
WHERE idcode NOT IN (SELECT ids FROM tmpIdTable)

However, there still can be issues with NULL.

Navar answered 13/5, 2009 at 15:48 Comment(3)
I actually had this same conversation this morning with a coworker. Its similar, but not quite the answer.Pelargonium
Conceptually, at least, it's even wrong if the subselect returns zero or one row. Because you're asking if a scalar, idcode, is equal to a zero- or one- element list. The scalar is never actually equal to the list, even if it's equal to the only element in a one-element list.Coessential
Indeed, I never use the construct and I would typically think of it as a code smell.Navar
H
3

I have no idea why would you write something like WHERE idcode <> (SELECT ids FROM tmpIdTable). A SELECT statement will return a set of tuples and your idcode either will or will NOT be IN this set. "WHERE idcode NOT IN (SELECT ids FROM tmpIdTable)" is the way to do it.

Hotheaded answered 13/5, 2009 at 14:45 Comment(5)
Thats why i knew to fix it, I just didn't really know exactly what the logic behind it was.Pelargonium
The <> statement may make sense if the sub-query is ordered in some way, e.g. "<> the largest [thing] in the list".Pogrom
You're right, in case the sub-query is ordered the <> notation could make some sense, but I would avoid it anyhow, as it's prone to error -- easy to overlook its intent. Whoever removes or changes the ordering of the sub-query later on may be baffled by why the query returns nothing.Misha
True. For disambiguation I would always put in a TOP 1 or MAX()/GROUP BY or similar in the sub-query. Just using <> against a set of values is too much side-effect dependent for my share.Pogrom
Hm... To err is human, I guess. :-\ I really thought that <> (SELECT ...) was valid, but at least SQL Server does not accept it when the sub-query returns more than a single value. It's no good practice anyway.Pogrom
S
3

If the SELECT subquery returns zero rows, that's a NULL. When NULL is compared to anything, the result is always UNKNOWN, and never TRUE. Confusingly enough, NOT UNKNOWN is equal to UNKNOWN.

I avoid three valued logic (TRUE, FALSE, UNKNOWN) whenever possible. It's not that hard to avoid once you get the hang of it.

If the SELECT subquery returns exactly one value, the comparison for inequality should return the result you expect.

If the SELECT subquery returns more than one value, you should get an error.

In general, NOT IN will return the result you expect when you are testing for non membership in a set.

This response overlaps other responses, but it's phrased a little differently.

Edited to add more detail about NOT IN:

I did some searching about NOT IN in Oracle, and I learned something I didn't know a half an hour ago. NOT IN is NULL sensitive. In particular,

X NOT IN (SELECT ...)

Is not the same as

NOT (X IN SELECT ...))

I may have to amend my earlier response!

Sisak answered 13/5, 2009 at 16:1 Comment(0)
S
0

When you say NOT IN, it's the opposite of IN where you are looking for any of the values to be included or not included independently (separated by a comma). However, you will have issues with when null values are included. As for looking for values not null, you can literally indicate not null in your where statement. If you combine the criteria of not null AND whatever other criteria, I suspect you'd have more success. NOT EXISTS is the generally recommended approach and overall generally good practice. It handles null scenarios much better.

Select table.column 
From Table
Where table.column is not null 
AND table.column NOT Exists ('value1','value2','value3')

Or you could change the Not Exists to a subquery like

Select table.column 
From Table
Where table.column is not null
AND table.column NOT Exists (Select column.value from tmpIDtable)

table.column = the table identifier with the name of the column.

A good explanation on this topic is located in the responses here

Seamus answered 8/11, 2023 at 16:12 Comment(0)
C
-1

in some versions of SQL != should be used for a "not equals" logical statement. Have you tried that?

Candis answered 13/5, 2009 at 14:51 Comment(1)
'<>' and '!=' are equivalent on SQL Server. There is no version that insists on '!='. But '<>' is the ANSI SQL standard way of doing it, though personally I tend to use != more often for some reason.Pogrom
C
-2

<> is essentially "nor"

on SQL WHERE NOT demo I tried the following

FAILS:

WHERE NOT Country='Mexico' AND WHERE NOT country ='Sweden'

WORKS

WHERE NOT Country='Mexico' AND country<>'Sweden'
Carin answered 28/7, 2020 at 23:36 Comment(3)
This is not answering the question. It's asking for differences between <> and NOT IN when used in a subquery. NOT and NOT IN aren't the same things. Additionally you cannot use WHERE twice like you did in your first where-clause.Inscrutable
The question is about NOT IN, the answer is about NOT.Rossetti
Pretty sure <> is NOT whereas NOT IN is nor. If I do WHERE status <> 'OK' I am saying where the status is not ok.Redford

© 2022 - 2024 — McMap. All rights reserved.