Why does NULL = NULL evaluate to false in SQL server
Asked Answered
G

21

183

In SQL server if you have nullParam=NULL in a where clause, it always evaluates to false. This is counterintuitive and has caused me many errors. I do understand the IS NULL and IS NOT NULL keywords are the correct way to do it. But why does SQL server behave this way?

Gobelin answered 3/12, 2009 at 22:28 Comment(13)
I don't have a sister, and nor does my friend. If "NULL = NULL" then we have a common sister, and are therefore related! :)Popery
It's defined SQL rather than SQL Server specific. PostgreSQL let's you change the behaviour to a degree: postgresql.org/docs/8.0/interactive/…Inocenciainoculable
There is a long running controversy on SQL NULLs (see for example: en.wikipedia.org/wiki/Null_%28SQL%29#Controversy and firstsql.com/inulls.htm). The specific point here is that equality is a long established mathematical concept and SQL violates it - equality is reflexive: for each x, x = x. That must be always true, otherwise one is introducing an interpretation of equality which is not standard and confusion is the obvious outcome.Finely
It doesn't violate mathematics at all. I'm thinking of two numbers. I'm not going to tell you what they are though. So now you tell me, are they equal?Tace
@Tom H.: read my answer, I don't want to repeat myself here.Finely
@MaD70, it doesn't violate the concept of equality because NULL is not a value. So "for each x, x=x" is still true because NULL is not part of the domain of x. Exactly the same issue arises with NaN in IEEE-754 floating point.Lunetta
@Mark Thornton: I have explained abundantly on what grounds I reject 3VL and NULLs in SQL and why FALSE is wrong in this case. But of course you don't care to read my answer, as other people.Finely
@Matt, I don't agree with your analogy. NULL = NULL wouldn't mean you have a common sister, it would mean you both lack a sister.Oilcloth
@manu08 No, the current implementation (that NULL is never equal to NULL) means that we both lack a sister, which was my point.Popery
@MattHamilton : I disagree with your interpretation of what the current implementation means. Consider the "set of my sisters" and the "set of your sisters". If those are both empty sets, then it is true that we both have the same set of sisters. In some uses of NULL, including the example you give, this would make True a better response to NULL = NULL.Symploce
The ANSI null standard is wrong. Null = Null should be true. null not like '%a%' should be true, because the known absence of a pattern CANNOT POSSIBLY equal a known pattern. That's not an unknown, that's a definitive 'false'. The standard already causes confusion because it violates common set theory that the empty set is equal to itself. It violates logic by treating something false as unknown, which results in logical inverses being treated as equivalent (e.g. null not like '%a%' results in the same value as null like '%a%'. It's TERRIBLE that this is becoming a permanent feature.Tuneless
The intermediate 'ternary' logic it attempts to use is bull, because in the end you must either get the result you want or not. The row must be included or not. The Boolean expression must evaluate to true or not. With the ANSI null standard, we get 'false' much more often than expected, in situations where we'd definitely not expect it. For example, where null not like '%a%' or null like '%a%', essentially a or !a, should always be true. Not so with the ANSI standard. A clearly Boolean expression ends up a total contradiction in the ANSI standard. So much stuff is going to break.Tuneless
Seriously, when is the last time you wrote a where clause and SQL Server returned a result set with maybe rows. They're either there or not. Interjecting ternary logic in a fundamentally Boolean space is asinine. End of story.Tuneless
L
233

Think of the null as "unknown" in that case (or "does not exist"). In either of those cases, you can't say that they are equal, because you don't know the value of either of them. So, null=null evaluates to not true (false or null, depending on your system), because you don't know the values to say that they ARE equal. This behavior is defined in the ANSI SQL-92 standard.

EDIT: This depends on your ansi_nulls setting. if you have ANSI_NULLS off, this WILL evaluate to true. Run the following code for an example...

set ansi_nulls off

if null = null
    print 'true'
else
    print 'false'


set ansi_nulls ON

if null = null
    print 'true'
else
    print 'false'
Leontina answered 3/12, 2009 at 22:30 Comment(28)
x = x only holds true when x is a known value. NULL is a textual representation of an unknown value. If you have two unknown values, you can't conclusively state anything about their equality. I believe that to have also held true for a few centuries.Mcgriff
@Dewayne Christensen: show me these centuries old mathematical/logical treatment of a concept equivalent to SQL NULLs that violates the reflexive property of equality. I'm not aware of any, so this will be a very interesting study for me. Or is these simply "common sense"?Finely
@Scott Ivey: yes, I disagree with such nonsense. The fact that such nonsense is in a standard is an aggravating factor. It says everything about the state of this field (IT).Finely
@Scott Ivey: If you had explained the reason given by the standard specifying that it's nonsense (and why it is nonsense), then there wouldn't have been my downvote.Finely
Since it's December, let's use a seasonal example. I have two presents under the tree. Now, you tell me if I got two of the same thing or not.Mcgriff
@Dewayne Christensen: hello? It is SQL NULL to be a questionable concept, not reflexivity of equality to be in doubt. My position is that is better to not use NULLs at all, there are design to represent missing information.Finely
My answer will hopefully clarify what I was struggling to express.Finely
@MaD70: reflexivity is a property which is not take for granted at all. There's another case where equality fails to evaluate, and this is perfectly accepted: is infinity = infinity true or false? The fact is that NULL is not a value, exactly as infinity is not a real number.Dallman
SQL NULL is not any different from IEEE floating point NaN, where you also have (NaN == NaN) == false && (NaN != Nan) == false && (NaN < NaN) == false && ... - because, well, if it's not a number, you just can't say much about it; it's something unknown. The concept is sound, even if unintuitive to people who have never seen it before.Yirinec
There's no violation of reflexivity here, because NULL is not a member of the set of values (domain, in relational terms). NULL is not a value. It's a placeholder for the value which is unknown.Yirinec
To put it in other words, every NULL in an SQL expression can be treated as a distinct mathematical variable. So an expression NULL = NULL should be treated as x = y, where x and y are unbound variables. Now if someone asks you, what is the value of x = y? The only reasonable answer is, "some z". So we have (x = y) = z - or, transcribing it back to SQL, (NULL = NULL) = NULL.Yirinec
This shows exactly what I was speaking about: confusion introduced just to save a questionable concept. So, what NULL really is? I read what you all wrote: it is not a value but one assigns it to variables. No, "NULL can be treated as a distinct mathematical variable". No one is reading my answer also, where I say that yes, the correct answer is (NULL = NULL) = NULL in 3VL but explain why I reject it. Also, what do you think about a DBMS which returns FALSE as value of NULL = NULL? This answer does not object to such DBMSes; he wrote: "..false or null, depending on your system.."Finely
It's more along the lines of "depending on configuration of one's system". ANSI SQL standard is unambiguous on this (it's 3VL), but for legacy reasons implementations often can't just stick to it and offer no choice on the patter. Hence ANSI_NULLS and other such cruft. In any case, you can configure MSSQL to treat NULLs properly.Yirinec
By the way, I didn't write "one assigns it to variables". I said that "it should be treated as a distinct unbound variable", which is a world of difference.Yirinec
w.r.t. infinity is the classical "can of worms", which I don't intend to open: ask professional mathematicians how much it is a well settled concept and when it was systematized acceptably (I seems to vaguely remember that both infinity and infinitesimal have a long and tortuous story).Finely
@Pavel Minaev: I never attributed that to you, I cite you in double quotes exactly. But the reality is that they are assigned to variables: as part of relation values to relation variables. So they are considered values.Finely
@Stefano Borini: w.r.t. infinity again - IIRC, division by zero, in mathematics, is an error, does not gives infinite. There are operators like limit and integral that "accept" and "return" infinite.Finely
@MaD70: "what do you think about a DBMS which returns FALSE as value of NULL = NULL?" I think it's return value is correct. Two things of unknown value may or may not contain something similar; since the value of both is something not known, there's no way to compare them. (See Dewayne Christensen's comment above about the Christmas presents for a more literal example.) @Dewayne: Nice!Asare
@Ken White: you are wrong. Correct answer in 3VL is NULL (see my answer). This reinforce my position - 3VL is counterintuitive, NULLs in SQL are a mess.Finely
No. I don't care how many times you say it, NULL = NULL is FALSE. Just like 1 - 0 is false, except you're comparing two unlike and unknown objects. If they can't be considered equal, then equality is FALSE. "If A is exactly the same as B, return TRUE, ELSE return FALSE". Is that logic a little more clear to you? If they AREN'T EQUAL, then the equality comparison MUST return false. I'm not sure why you're having such a hard time grasping that concept, I'm afraid.Asare
@Ken White: ok, stay happy in your ignorance. Good luck.Finely
@Pavel Minaev: I know this is an old discussion, but MaD70 makes some valid points that are being ignored. If I give you two boxes, the contents of which are unknown to you, and ask, "Is the thing in Box A the same as the thing in Box B?" and you say, "No", that is a problem. The fact is, you don't know whether they are the same or not so you cannot validly say "yes" or "no". It's similar to division by zero: I'm asking a question that cannot be answered. Defaulting the answer to "false", is pure stupidity (or retardation, whichever you prefer). MaD70 is right about that.Distilled
@Finely On the mathematics front alone and no comment on whether nulls are intuitive, there are many math constructs that drop properties we take for granted. With string concatenation, A||B <> B||A and in linear algebra matrix multiplication doesn't generally commute. The whole concept of non-Euclidean geometry, which has practical value in relativity theory, required a difficult rejection of the natural approach to parallel lines and the fifth postulate. I believe partial equivalence relation covers the idea we're talking about now.Ia
Note: Ansi_null only works in some casesNonappearance
Silly arguments above. In logic, as well as common sense, the correct answer to "unknown1 = unknown2" is NEITHER True nor False. It is "I don't know". The flaw here is trying to stuff "I don't know" into "True or False?". SQL implementors presumably decided it was safer to say "False" than to say "True". Maybe, but as MaD70 points out in his answer, this is still a logic flaw. The result should be "NULL" (which represents "I don't know"). That is, one should PROPAGATE the LACK OF KNOWLEDGE up to enclosing queries.Symploce
"If you have two unknown values, you can't conclusively state anything about their equality." << Yes, you absolutely can, because we're not talking about unknown values, we're talking about a known value, null, which represents an unknown/unspecified value. A symbolic digital system CANNOT store an unknown value. It can only store known symbols, which may represent, anything, known or unknown. By pretending that it can process actual unknown values, and not just known symbols that represent the concept, the standard just introduces faulty logic. a or !a should always be true.Tuneless
SQL Server setting ANSI nulls on permanently is a NIGHTMARE scenario. Not only does it make it impossible to write sane and reliable Boolean statements, it's going to force people to rewrite systems, and force everyone to write low-performance queries littered with ISNULL checks, when a simple = null would suffice. It's crazy. The answer to the question of whether two null fields are equal is not for SQL Server to determine, and introduce unnecessarily complex ternary logic in the process; it's up to the writer of the query to determine what null = null means, but null = null is true.Tuneless
Similar to how infinity != infinity :)Bobbyebobbysocks
T
156

How old is Frank? I don't know (null).

How old is Shirley? I don't know (null).

Are Frank and Shirley the same age?

Correct answer should be "I don't know" (null), not "no", as Frank and Shirley might be the same age, we simply don't know.

Tesstessa answered 1/2, 2014 at 23:39 Comment(12)
I disagree that null signifies "unknown". What it actually means is "no data". That might be used to represent the case where information is not known, but it is actually more likely to be used to indicate that something doesn't exist. To continue your example: What is Frank's middle name? He doesn't have one (null). What is Shirley's middle name? She doesn't have one (null). Do Frank and Shirley have the same middle name? Yes? No? Don't know? I can see an argument for "no", and I can see an argument for "don't know", but there's no real argument for "yes" unless you're being overly literal.Acclaim
@richiban I disagree. The lack of existence of a row means 'no data'Tesstessa
@NeilMcGuigan That's true if for data that has its own table, but what about data that's represented in a column? Wouldn't you use 'null' to represent the fact that the data doesn't exist? 'Not known' is a very specific reason for the data being absent.Acclaim
But null = null yields FALSE, not NULL.Ragan
@Ragan I agree with you, however that is incorrectTesstessa
@Acclaim Your question is a clear yes. Frank does not have a middle name. Shirley does not have a middle name. Therefore they do have the same middle name.Rocaille
@MichaelTsang No, Frank does not have a middle name, therefore he does not have the same middle name as anyone. Or, more correctly, for Frank the question doesn't even make sense.Acclaim
I don't know (null) = I don't know (null). If table is a table, then table = table. Sql has no logic here.Keeling
@Acclaim if it is known that Frank has no middle name, the correct value to store in the database is '' (the empty string).Nichellenichol
@Acclaim I like your analogy much better and tbh what is programming if not being "overly literal?" 😜 In the programming languages I commonly use, null == null is true, which just feels right 😎Rete
@Rete Maybe 'overly literal' doesn't quite get my point across; it might be closer to say that if you think 'null equals null' you're playing word games, the same way you would be to say that 'nothing equals nothing'. If you treat nothing/null as a value like any other then you quickly descend into logical absurdities such as in the old riddle "if a ham sandwich is better than nothing and nothing is better than God, we can conclude that a ham sandwich is better than God"Acclaim
Shirley you can't be serious?Melamie
F
31

Here I will hopefully clarify my position.

That NULL = NULL evaluate to FALSE is wrong. Hacker and Mister correctly answered NULL. Here is why. Dewayne Christensen wrote to me, in a comment to Scott Ivey:

Since it's December, let's use a seasonal example. I have two presents under the tree. Now, you tell me if I got two of the same thing or not.

They can be different or they can be equal, you don't know until one open both presents. Who knows? You invited two people that don't know each other and both have done to you the same gift - rare, but not impossible §.

So the question: are these two UNKNOWN presents the same (equal, =)? The correct answer is: UNKNOWN (i.e. NULL).

This example was intended to demonstrate that "..(false or null, depending on your system).." is a correct answer - it is not, only NULL is correct in 3VL (or is ok for you to accept a system which gives wrong answers?)

A correct answer to this question must emphasize this two points:

  • three-valued logic (3VL) is counterintuitive (see countless other questions on this subject on Stackoverflow and in other forum to make sure);
  • SQL-based DBMSes often do not respect even 3VL, they give wrong answers sometimes (as, the original poster assert, SQL Server do in this case).

So I reiterate: SQL does not any good forcing one to interpret the reflexive property of equality, which state that:

for any x, x = x §§ (in plain English: whatever the universe of discourse, a "thing" is always equal to itself).

.. in a 3VL (TRUE, FALSE, NULL). The expectation of people would conform to 2VL (TRUE, FALSE, which even in SQL is valid for all other values), i.e. x = x always evaluate to TRUE, for any possible value of x - no exceptions.

Note also that NULLs are valid " non-values " (as their apologists pretend them to be) which one can assign as attribute values(??) as part of relation variables. So they are acceptable values of every type (domain), not only of the type of logical expressions.

And this was my point: NULL, as value, is a "strange beast". Without euphemism, I prefer to say: nonsense.

I think that this formulation is much more clear and less debatable - sorry for my poor English proficiency.

This is only one of the problems of NULLs. Better to avoid them entirely, when possible.

§ we are concerned about values here, so the fact that the two presents are always two different physical objects are not a valid objection; if you are not convinced I'm sorry, it is not this the place to explain the difference between value and "object" semantics (Relational Algebra has value semantics from the start - see Codd's information principle; I think that some SQL DBMS implementors don't even care about a common semantics).

§§ to my knowledge, this is an axiom accepted (in a form or another, but always interpreted in a 2VL) since antiquity and that exactly because is so intuitive. 3VLs (is a family of logics in reality) is a much more recent development (but I'm not sure when was first developed).

Side note: if someone will introduce Bottom, Unit and Option Types as attempts to justify SQL NULLs, I will be convinced only after a quite detailed examination that will shows of how SQL implementations with NULLs have a sound type system and will clarify, finally, what NULLs (these "values-not-quite-values") really are.


In what follow I will quote some authors. Any error or omission is probably mine and not of the original authors.

Joe Celko on SQL NULLs

I see Joe Celko often cited on this forum. Apparently he is a much respected author here. So, I said to myself: "what does he wrote about SQL NULLs? How does he explain NULLs numerous problems?". One of my friend has an ebook version of Joe Celko's SQL for smarties: advanced SQL programming, 3rd edition. Let's see.

First, the table of contents. The thing that strikes me most is the number of times that NULL is mentioned and in the most varied contexts:

3.4 Arithmetic and NULLs 109
3.5 Converting Values to and from NULL 110
3.5.1 NULLIF() Function 110
6 NULLs: Missing Data in SQL 185
6.4 Comparing NULLs 190
6.5 NULLs and Logic 190
6.5.1 NULLS in Subquery Predicates 191
6.5.2 Standard SQL Solutions 193
6.6 Math and NULLs 193
6.7 Functions and NULLs 193
6.8 NULLs and Host Languages 194
6.9 Design Advice for NULLs 195
6.9.1 Avoiding NULLs from the Host Programs 197
6.10 A Note on Multiple NULL Values 198
10.1 IS NULL Predicate 241
10.1.1 Sources of NULLs 242
...

and so on. It rings "nasty special case" to me.

I will go into some of these cases with excerpts from this book, trying to limit myself to the essential, for copyright reasons. I think these quotes fall within "fair use" doctrine and they can even stimulate to buy the book - so I hope that no one will complain (otherwise I will need to delete most of it, if not all). Furthermore, I shall refrain from reporting code snippets for the same reason. Sorry about that. Buy the book to read about datailed reasoning.

Page numbers between parenthesis in what follow.

NOT NULL Constraint (11)

The most important column constraint is the NOT NULL, which forbids the use of NULLs in a column. Use this constraint routinely, and remove it only when you have good reason. It will help you avoid the complications of NULL values when you make queries against the data.

It is not a value; it is a marker that holds a place where a value might go.

Again this "value but not quite a value" nonsense. The rest seems quite sensible to me.

(12)

In short, NULLs cause a lot of irregular features in SQL, which we will discuss later. Your best bet is just to memorize the situations and the rules for NULLs when you cannot avoid them.

Apropos of SQL, NULLs and infinite:

(104) CHAPTER 3: NUMERIC DATA IN SQL

SQL has not accepted the IEEE model for mathematics for several reasons.

...

If the IEEE rules for math were allowed in SQL, then we would need type conversion rules for infinite and a way to represent an infinite exact numeric value after the conversion. People have enough trouble with NULLs, so let’s not go there.

SQL implementations undecided on what NULL really means in particular contexts:

3.6.2 Exponential Functions (116)

The problem is that logarithms are undefined when (x <= 0). Some SQL implementations return an error message, some return a NULL and DB2/ 400; version 3 release 1 returned *NEGINF (short for “negative infinity”) as its result.

Joe Celko quoting David McGoveran and C. J. Date:

6 NULLs: Missing Data in SQL (185)

In their book A Guide to Sybase and SQL Server, David McGoveran and C. J. Date said: “It is this writer’s opinion than NULLs, at least as currently defined and implemented in SQL, are far more trouble than they are worth and should be avoided; they display very strange and inconsistent behavior and can be a rich source of error and confusion. (Please note that these comments and criticisms apply to any system that supports SQL-style NULLs, not just to SQL Server specifically.)”

NULLs as a drug addiction:

(186/187)

In the rest of this book, I will be urging you not to use them, which may seem contradictory, but it is not. Think of a NULL as a drug; use it properly and it works for you, but abuse it and it can ruin everything. Your best policy is to avoid NULLs when you can and use them properly when you have to.

My unique objection here is to "use them properly", which interacts badly with specific implementation behaviors.

6.5.1 NULLS in Subquery Predicates (191/192)

People forget that a subquery often hides a comparison with a NULL. Consider these two tables:

...

The result will be empty. This is counterintuitive, but correct.

(separator)

6.5.2 Standard SQL Solutions (193)

SQL-92 solved some of the 3VL (three-valued logic) problems by adding a new predicate of the form:

<search condition> IS [NOT] TRUE | FALSE | UNKNOWN

But UNKNOWN is a source of problems in itself, so that C. J. Date, in his book cited below, reccomends in chapter 4.5. Avoiding Nulls in SQL:

  • Don't use the keyword UNKNOWN in any context whatsoever.

Read "ASIDE" on UNKNOWN, also linked below.

6.8 NULLs and Host Languages (194)

However, you should know how NULLs are handled when they have to be passed to a host program. No standard host language for which an embedding is defined supports NULLs, which is another good reason to avoid using them in your database schema.

(separator)

6.9 Design Advice for NULLs (195)

It is a good idea to declare all your base tables with NOT NULL constraints on all columns whenever possible. NULLs confuse people who do not know SQL, and NULLs are expensive.

Objection: NULLs confuses even people that know SQL well, see below.

(195)

NULLs should be avoided in FOREIGN KEYs. SQL allows this “benefit of the doubt” relationship, but it can cause a loss of information in queries that involve joins. For example, given a part number code in Inventory that is referenced as a FOREIGN KEY by an Orders table, you will have problems getting a listing of the parts that have a NULL. This is a mandatory relationship; you cannot order a part that does not exist.

(separator)

6.9.1 Avoiding NULLs from the Host Programs (197)

You can avoid putting NULLs into the database from the Host Programs with some programming discipline.

...

  1. Determine impact of missing data on programming and reporting: Numeric columns with NULLs are a problem, because queries using aggregate functions can provide misleading results.

(separator)

(227)

The SUM() of an empty set is always NULL. One of the most common programming errors made when using this trick is to write a query that could return more than one row. If you did not think about it, you might have written the last example as: ...

(separator)

10.1.1 Sources of NULLs (242)

It is important to remember where NULLs can occur. They are more than just a possible value in a column. Aggregate functions on empty sets, OUTER JOINs, arithmetic expressions with NULLs, and OLAP operators all return NULLs. These constructs often show up as columns in VIEWs.

(separator)

(301)

Another problem with NULLs is found when you attempt to convert IN predicates to EXISTS predicates.

(separator)

16.3 The ALL Predicate and Extrema Functions (313)

It is counterintuitive at first that these two predicates are not the same in SQL:

...

But you have to remember the rules for the extrema functions—they drop out all the NULLs before returning the greater or least values. The ALL predicate does not drop NULLs, so you can get them in the results.

(separator)

(315)

However, the definition in the standard is worded in the negative, so that NULLs get the benefit of the doubt. ...

As you can see, it is a good idea to avoid NULLs in UNIQUE constraints.

Discussing GROUP BY:

NULLs are treated as if they were all equal to each other, and form their own group. Each group is then reduced to a single row in a new result table that replaces the old one.

This means that for GROUP BY clause NULL = NULL does not evaluate to NULL, as in 3VL, but it evaluate to TRUE.

SQL standard is confusing:

The ORDER BY and NULLs (329)

Whether a sort key value that is NULL is considered greater or less than a non-NULL value is implementation-defined, but...

... There are SQL products that do it either way.

In March 1999, Chris Farrar brought up a question from one of his developers that caused him to examine a part of the SQL Standard that I thought I understood. Chris found some differences between the general understanding and the actual wording of the specification.

And so on. I think is enough by Celko.

C. J. Date on SQL NULLs

C. J. Date is more radical about NULLs: avoid NULLs in SQL, period. In fact, chapter 4 of his SQL and Relational Theory: How to Write Accurate SQL Code is titled "NO DUPLICATES, NO NULLS", with subchapters "4.4 What's Wrong with Nulls?" and "4.5 Avoiding Nulls in SQL" (follow the link: thanks to Google Books, you can read some pages on-line).

Fabian Pascal on SQL NULLs

From its Practical Issues in Database Management - A Reference for the Thinking Practitioner (no excerpts on-line, sorry):

10.3 Pratical Implications

10.3.1 SQL NULLs

... SQL suffers from the problems inherent in 3VL as well as from many quirks, complications, counterintuitiveness, and outright errors [10, 11]; among them are the following:

  • Aggregate functions (e.g., SUM(), AVG()) ignore NULLs (except for COUNT()).
  • A scalar expression on a table without rows evaluates incorrectly to NULL, instead of 0.
  • The expression "NULL = NULL" evaluates to NULL, but is actually invalid in SQL; yet ORDER BY treats NULLs as equal (whatever they precede or follow "regular" values is left to DBMS vendor).
  • The expression "x IS NOT NULL" is not equal to "NOT(x IS NULL)", as is the case in 2VL.

...

All commercially implemented SQL dialects follow this 3VL approach, and, thus, not only do they exibits these problems, but they also have spefic implementation problems, which vary across products.

Finely answered 3/12, 2009 at 22:28 Comment(9)
"And this was my point: NULL, as value, is a "strange beast"." - that's because NULL is not a value.Yirinec
Also, SQL Server does not give (NULL = NULL) -> FALSE. To quote the documentation for ANSI_NULLS: "When ON is specified, all comparisons to a null value evaluate to UNKNOWN. When OFF is specified, comparisons of non-UNICODE values to a null value evaluate to TRUE if both values are NULL. "Yirinec
@Pavel Minaev: a) and how TRUE is better than FALSE? b) If it is not a value why is assigned as part of variable values?Finely
It's not really evaluated as TRUE. However, when a boolean expression is used in (e.g.) a WHERE clause of a SELECT, only rows for which the expression evaluates to TRUE are considered; FALSE and UNKNOWN rows are dropped from the result. The only other option, really, would be to return a single NULL from the entire SELECT (meaning "it is not known which rows satisfy this"), which is not very practical. In a sense, you could say that WHERE condition should really read WHERE condition FOR SURE.Yirinec
Sorry, I apparently replied to a wrong thing. TRUE is not better (it is in fact wrong), but MSSQL will only give TRUE when ANSI_NULLS are off. In that mode, it simply treats NULL as a value. Everything I said elsewhere was with respect to` ANSI_NULLS ON` - the standard-compliant, 3VL behavior.Yirinec
@Pavel Minaev: no problem, a minor quibble.Finely
>>Since it's December, let's use a seasonal example. I have two presents under the tree. Now, you tell me if I got two of the same thing or not. ..... yes, you did inasmuch as you got two things and as far as you are concerned right now, to the extent of your current knowledge, they are exactly the same to youCountryandwestern
TL;DR - NULL = NULL evaluates to false.Sweetandsour
null = null should be true. null is well-defined value which may represent an unknown value, but it may also represent the absence of a value. It should be up to the developer to decide what null represents, but null itself is absolutely a value and null is null = null. Any other implementation is bound for disaster, because you're interjecting ternary logic into predicates that are fundamentally Boolean. I'm APPALLED that this is becoming a permanent on setting in SQL server. OFF OFF OFF with it.Tuneless
S
17

The answers here all seem to come from a CS perspective so I want to add one from a developer perspective.

For a developer NULL is very useful. The answers here say NULL means unknown, and maybe in CS theory that's true, don't remember, it's been a while. In actual development though, at least in my experience, that happens about 1% of the time. The other 99% it is used for cases where the value is not UNKNOWN but it is KNOWN TO BE ABSENT.

For example:

  • When using soft deletes, deletedAt = NULL doesn't mean we don't know when it was deleted, it means it wasn't deleted.

  • Client.LastPurchase, for a new client. It is not unknown, it is known that he hasn't made a purchase yet.

  • When using an ORM with a Table per Class Hierarchy mapping, some values are just not mapped for certain classes.

  • When mapping a tree structure a root will usually have Parent = NULL

  • And many more...

I'm sure most developers at some point wrote WHERE value = NULL, didn't get any results, and that's how they learned about IS NULL syntax. Just look how many votes this question and the linked ones have.

SQL Databases are a tool, and they should be designed the way which is easiest for their users to understand.

Starchy answered 23/8, 2018 at 20:40 Comment(1)
Everybody seems to shout “NULL is unknown” and then justify the behaviour. Yes, if that is a premise then 3VL maybe the answer. But in almost all DBs I work on, NULL means absent. Sorry your voice is lost in the wilderness @StarchyAmey
E
11

Just because you don't know what two things are, does not mean they're equal. If when you think of NULL you think of “NULL” (string) then you probably want a different test of equality like Postgresql's IS DISTINCT FROM AND IS NOT DISTINCT FROM

From the PostgreSQL docs on "Comparison Functions and Operators"

expression IS DISTINCT FROM expression

expression IS NOT DISTINCT FROM expression

For non-null inputs, IS DISTINCT FROM is the same as the <> operator. However, if both inputs are null it returns false, and if only one input is null it returns true. Similarly, IS NOT DISTINCT FROM is identical to = for non-null inputs, but it returns true when both inputs are null, and false when only one input is null. Thus, these constructs effectively act as though null were a normal data value, rather than "unknown".

Exquisite answered 3/12, 2009 at 22:29 Comment(0)
H
9

Maybe it depends, but I thought NULL=NULL evaluates to NULL like most operations with NULL as an operand.

Hunley answered 3/12, 2009 at 22:30 Comment(0)
M
7

MSDN has a nice descriptive article on nulls and the three state logic that they engender.

In short, the SQL92 spec defines NULL as unknown, and NULL used in the following operators causes unexpected results for the uninitiated:

= operator NULL   true   false 
NULL       NULL   NULL   NULL
true       NULL   true   false
false      NULL   false  true

and op     NULL   true   false 
NULL       NULL   NULL   false
true       NULL   true   false
false      false  false  false

or op      NULL   true   false 
NULL       NULL   true   NULL
true       true   true   true
false      NULL   true   false
Mediatory answered 3/12, 2009 at 22:47 Comment(2)
But the question is not about 3VL (three-valued logic) is about the reflexive property of equality.Finely
To be more precise, as I finally detailed in my answer, problems arise when equality is interpreted in a 3VL so that reflexive property of equality does not always evaluate to true.Finely
F
7

At technet there is a good explanation for how null values work.

Null means unknown.

Therefore the Boolean expression

value=null

does not evaluate to false, it evaluates to null, but if that is the final result of a where clause, then nothing is returned. That is a practical way to do it, since returning null would be difficult to conceive.

It is interesting and very important to understand the following:

If in a query we have

where (value=@param Or @param is null) And id=@anotherParam

and

  • value=1
  • @param is null
  • id=123
  • @anotherParam=123

then

"value=@param" evaluates to null
"@param is null" evaluates to true
"id=@anotherParam" evaluates to true

So the expression to be evaluated becomes

(null Or true) And true

We might be tempted to think that here "null Or true" will be evaluated to null and thus the whole expression becomes null and the row will not be returned.

This is not so. Why?

Because "null Or true" evaluates to true, which is very logical, since if one operand is true with the Or-operator, then no matter the value of the other operand, the operation will return true. Thus it does not matter that the other operand is unknown (null).

So we finally have true=true and thus the row will be returned.

Note: with the same crystal clear logic that "null Or true" evaluates to true, "null And true" evaluates to null.

Update:
Ok, just to make it complete I want to add the rest here too which turns out quite fun in relation to the above.

"null Or false" evaluates to null, "null And false" evaluates to false. :)

The logic is of course still as self-evident as before.

Fourgon answered 5/2, 2014 at 23:8 Comment(0)
M
5

The concept of NULL is questionable, to say the least. Codd introduced the relational model and the concept of NULL in context (and went on to propose more than one kind of NULL!) However, relational theory has evolved since Codd's original writings: some of his proposals have since been dropped (e.g. primary key) and others never caught on (e.g. theta operators). In modern relational theory (truly relational theory, I should stress) NULL simply does not exist. See The Third Manifesto. http://www.thethirdmanifesto.com/

The SQL language suffers the problem of backwards compatibility. NULL found its way into SQL and we are stuck with it. Arguably, the implementation of NULL in SQL is flawed (SQL Server's implementation makes things even more complicated due to its ANSI_NULLS option).

I recommend avoiding the use of NULLable columns in base tables.


Although perhaps I shouldn't be tempted, I just wanted to assert a corrections of my own about how NULL works in SQL:

NULL = NULL evaluates to UNKNOWN.

UNKNOWN is a logical value.

NULL is a data value.

This is easy to prove e.g.

SELECT NULL = NULL

correctly generates an error in SQL Server. If the result was a data value then we would expect to see NULL, as some answers here (wrongly) suggest we would.

The logical value UNKNOWN is treated differently in SQL DML and SQL DDL respectively.

In SQL DML, UNKNOWN causes rows to be removed from the resultset.

For example:

CREATE TABLE MyTable
(
 key_col INTEGER NOT NULL UNIQUE, 
 data_col INTEGER
 CHECK (data_col = 55)
);

INSERT INTO MyTable (key_col, data_col)
   VALUES (1, NULL);

The INSERT succeeds for this row, even though the CHECK condition resolves to NULL = NULL. This is due defined in the SQL-92 ("ANSI") Standard:

11.6 table constraint definition

3)

If the table constraint is a check constraint definition, then let SC be the search condition immediately contained in the check constraint definition and let T be the table name included in the corresponding table constraint descriptor; the table constraint is not satisfied if and only if

EXISTS ( SELECT * FROM T WHERE NOT ( SC ) )

is true.

Read that again carefully, following the logic.

In plain English, our new row above is given the 'benefit of the doubt' about being UNKNOWN and allowed to pass.

In SQL DML, the rule for the WHERE clause is much easier to follow:

The search condition is applied to each row of T. The result of the where clause is a table of those rows of T for which the result of the search condition is true.

In plain English, rows that evaluate to UNKNOWN are removed from the resultset.

Mucilaginous answered 7/10, 2010 at 12:41 Comment(0)
R
5

Because NULL means 'unknown value' and two unknown values cannot be equal.

So, if to our logic NULL N°1 is equal to NULL N°2, then we have to tell that somehow:

SELECT 1
WHERE ISNULL(nullParam1, -1) = ISNULL(nullParam2, -1)

where known value -1 N°1 is equal to -1 N°2

Ra answered 17/5, 2012 at 13:45 Comment(1)
nullParam1 = -1 and nullParam2 =NULL and airplain crash .... should be ISNULL(NULLIF(@nullParam1, @nullParam2), NULLIF(@nullParam2, nullParam1)) IS NULLMicroclimatology
I
4

There are two sensible ways to handle NULL = NULL comparisons in a WHERE clause, and they boil down to "What do you mean by NULL?" One way assumes NULL means "unknown," and the other assumes NULL means "data does not exist." SQL has chosen a third way which is wrong all around.

The "NULL means unknown" solution: Throw an error.

Unknown = unknown should evaluate to 3VL null. But the output of a WHERE clause is 2VL: You either return the row or you don't. It's like being asked to divide by zero and return a number: There is no correct response. So you throw an error instead, and force the programmer to explicitly handle this situation.

The "NULL means no data" solution: Return the row.

No data = no data should evaluate to true. If I'm comparing two people, and they have the same first name, and the same last name, and neither has a middle name, then it is correct to say "These people have the same name."

The SQL solution: Don't return the row.

This is always wrong. If NULL means "unknown," then you don't know if the row should be returned or not, and you should not try to guess. If NULL means "no data," then you should return the row. Either way, silently removing the row is incorrect and will cause problems. It's the worst of both worlds.

Setting aside theory and speaking in practical terms, I'm with AlexDev: I have almost never encountered a case where "return the row" was not the desired result. However, "almost never" is not "never," and SQL databases often serve as the backbones of big important systems, so I can see a fair case for being rigorous and throwing an error.

What I cannot see is a case for silently coercing 3VL null into 2VL false. Like most silent type coercions, it's a rabid weasel waiting to be set loose in your system, and when the weasel finally jumps out and bites someone, you'll have the merry devil of a time tracking it back to its nest.

Implicate answered 11/4, 2021 at 19:45 Comment(0)
A
3

NULL isn't equal to anything, not even itself. My personal solution to understanding the behavior of NULL is to avoid using it as much as possible :).

Austere answered 3/12, 2009 at 22:32 Comment(7)
might as well be equal to everything, as it is in the case of left/right/outer joins...Emanuele
What a silly unproductive answer. The same could said to elementary kids about algebra, but without actually acknowledging what is trying to solve it would just come off as silly, which it did.Exquisite
@Evan: Actually, avoiding NULL is a sound solution. 3-valued logic is not uncontroversial, and many people feel that SQL would be better off without NULL and all the (necessary) complexity it entails.Cryptogam
"Many people" is a weasel word, and "not uncontroversial" is a way to cloak the more simple "controversial" of which 3VL is not.Exquisite
"NULL isn't equal to anything, not even itself." going by that logic, <somevalue> != NULL should return true. In SQL's weird universe, however, it's false.Dichotomous
@TomLint it's not equal or unequal to anything. it can't be compared, only checkedRingtail
"it can't be compared" it can, it's just that some person with no semblance of sanity decided otherwise.Dichotomous
G
3

The question:
Does one unknown equal another unknown?
(NULL = NULL)
That question is something no one can answer so it defaults to true or false depending on your ansi_nulls setting.

However the question:
Is this unknown variable unknown?
This question is quite different and can be answered with true.

nullVariable = null is comparing the values
nullVariable is null is comparing the state of the variable

Gregg answered 3/12, 2009 at 23:52 Comment(0)
B
3

The confusion arises from the level of indirection (abstraction) that comes about from using NULL.

Going back to the "what's under the Christmas tree" analogy, "Unknown" describes the state of knowledge about what is in Box A.

So if you don't know what's in Box A, you say it's "Unknown", but that doesn't mean that "Unknown" is inside the box. Something other than unknown is in the box, possibly some kind of object, or possibly nothing is in the box.

Similarly, if you don't know what's in Box B, you can label your state of knowledge about the contents as being "Unknown".

So here's the kicker: Your state of knowledge about Box A is equal to your state of knowledge about Box B. (Your state of knowledge in both cases is "Unknown" or "I don't know what's in the Box".) But the contents of the boxes may or may not be equal.

Going back to SQL, ideally you should only be able to compare values when you know what they are. Unfortunately, the label that describes a lack of knowledge is stored in the cell itself, so we're tempted to use it as a value. But we should not use that as a value, because it would lead to "the content of Box A equals the content of Box B when we don't know what's in Box A and/or we don't know what's in Box B. (Logically, the implication "if I don't know what's in Box A and if I don't know what's in Box B, then what's in Box A = What's in Box B" is false.)

Yay, Dead Horse.

Barytone answered 27/3, 2014 at 23:4 Comment(0)
J
1

null is unknown in sql so we cant expect two unknowns to be same.

However you can get that behavior by setting ANSI_NULLS to Off(its On by Default) You will be able to use = operator for nulls

SET ANSI_NULLS off
if null=null
print 1
else 
print 2
set ansi_nulls on
if null=null
print 1
else 
print 2
Jacobus answered 3/12, 2009 at 22:38 Comment(2)
This is all kinds of no. The world has a definition of null, learn to understand it or just alter the table to have int types and update the columns.Exquisite
I really did not recommend SET ANSI_NULLS off. I found out about ANSI_NULLS the hard way. But it is always good to know all the options available specially when you come across a line which says Where SomeId=null How would you make sense of that line without knowing about ANSI_NULLS. The way i look it, my post was useful..:)Jacobus
P
1

You work for the government registering information about citizens. This includes the national ID for every person in the country. A child was left at the door of a church some 40 years ago, nobody knows who their parents are. This person's father ID is NULL. Two such people exist. Count people who share the same father ID with at least one other person (people who are siblings). Do you count those two too?

The answer is no, you don’t, because we don’t know if they are siblings or not.

Suppose you don’t have a NULL option, and instead use some pre-determined value to represent “the unknown”, perhaps an empty string or the number 0 or a * character, etc. Then you would have in your queries that * = *, 0 = 0, and “” = “”, etc. This is not what you want (as per the example above), and as you might often forget about these cases (the example above is a clear fringe case outside ordinary everyday thinking), then you need the language to remember for you that NULL = NULL is not true.

Necessity is the mother of invention.

Paolo answered 10/7, 2019 at 10:56 Comment(0)
M
1

To quote the Christmas analogy again:

In SQL, NULL basically means "closed box" (unknown). So, the result of comparing two closed boxes will also be unknown (null).

I understand, for a developer, this is counter-intuitive, because in programming languages, often NULL rather means "empty box" (known). And comparing two empty boxes will naturally yield true / equal.

This is why JavaScript for example distinguishes between null and undefined.

Marmion answered 5/11, 2020 at 3:42 Comment(0)
F
0

Just an addition to other wonderful answers:

AND: The result of true and unknown is unknown, false and unknown is false,
while unknown and unknown is unknown.

OR: The result of true or unknown is true, false or unknown is unknown, while unknown or unknown is unknown.

NOT: The result of not unknown is unknown
Follicle answered 14/4, 2015 at 9:55 Comment(0)
N
0

If you are looking for an expression returning true for two NULLs you can use:

SELECT 1 
WHERE EXISTS (
    SELECT NULL
    INTERSECT
    SELECT NULL
)

It is helpful if you want to replicate data from one table to another.

Naphthalene answered 12/4, 2019 at 22:48 Comment(0)
C
0

The equality test, for example, in a case statement when clause, can be changed from

XYZ = NULL 

to

XYZ IS NULL

If I want to treat blanks and empty string as equal to NULL I often also use an equality test like:

(NULLIF(ltrim( XYZ ),'') IS NULL)
Coyote answered 12/12, 2019 at 0:57 Comment(0)
B
0

Null isn't equal to anything including itself Best way to test if an object is null is to check whether the object equals itself since null is the only object not equal to itself

const obj = null

console.log(obj==obj) //false, then it's null

Check this article

Biliary answered 25/10, 2021 at 12:43 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Ivonne

© 2022 - 2024 — McMap. All rights reserved.