SQL Inner Join On Null Values
Asked Answered
T

8

49

I have a Join

SELECT * FROM Y
INNER JOIN X ON ISNULL(X.QID, 0) = ISNULL(y.QID, 0) 

Isnull in a Join like this makes it slow. It's like having a conditional Join. Is there any work around to something like this? I have a lot of records where QID is Null

Anyone have a work around that doesn't entail modifying the data

Tsarevna answered 4/2, 2010 at 18:32 Comment(11)
For what database (including version)?Beatitude
Only SQL Server and Access have ISNULL(), so I'd assume SQL ServerNonalignment
Is it safe to assume that in both X and Y QID could be any value including NULL and 0, but that NULL and 0 are to be considered equal?Prink
Did you mean to have X.QID = Y.QID before the AND?Eleonoreeleoptene
MySQL has an entirely different ISNULL function; the syntax used here is usually spelled COALESCE. Then again, MySQL also has the <=> operator, which won't help for this question.Reply
Does the QID column contain lots of null in both tables? If so you effectively will get a crossjoin on the null columns which seems like a funny result.Deluna
I would be interested in seeing a query plan, I think the comments are just over analyzing what is probably happening. The ISNULL() is effectively byassing the index. This is a fairly novice query written poorly: Not hard to fix, but it just bit him. There is construct for this I've given both of them in my answer.Scale
yes qid does contain lots of nulls and 0 and Null are meant to be equalTsarevna
I am sorry there are other parts to this join that would make it not a cross join. This is a small part of a huge beast of a view.Tsarevna
Sounds like you ought to be storing zeros and not nulls in the field. Logically Null is not equal to Null in databases.Full
you may be correct in that, but thats not really an option at this point.Tsarevna
S
71

You have two options

INNER JOIN x
   ON x.qid = y.qid OR (x.qid IS NULL AND y.qid IS NULL)

or easier

INNER JOIN x
  ON x.qid IS NOT DISTINCT FROM y.qid
Scale answered 4/2, 2010 at 18:42 Comment(12)
Certainly, the IS NOT DISTINCT FROM can not be slower, so I would go with it. But neither of these should be /slow/. You would have to be talking about a fairly massive data set for it to matter.Scale
hmm evan as you can probably understand this is a smaller version of a much larger view. As you stated it what is shown above is a novice problem that has gotten away from me. While i don't want to air my view all over the place do you have a minute to look at it maybe and help me patch the wound...Tsarevna
sure you can contact me on irc. EvanCarroll on freenode.Scale
Am I missing something, or is the "IS NOT DISTINCT FROM" feature not available in SQL Server? From what I can find, it will likely make it into the next release, but it's not in SQL Server 2008.Mullein
@Garland yes, there is no IS NOT DISTINCT FROM in SQL ServerSurge
-1 Then why is this the voted and most rated answer? The first option doesn't appear to work either as I get "The multi-part identifier "dbo.tbl_MyTable" could not be bound."Godhood
Snap! Telling it like it isKlockau
What is the meaning of below query( b.field is null??) select * from table1 a left join table 2 b on a.id = b.id and b.field is null b.field is null what it will do in join conditionCoherence
IS NULL x evaluates to true if and only if x is null. true is a value like all others. the trues on the left and the true on the right will be cross-joined.Scale
Greetings from the year 2022: I'm pleased to say that SQL Server 2022 now finally supports IS [NOT] DISTINCT FROM: learn.microsoft.com/en-gb/sql/t-sql/queries/…Heriot
@Heriot If you want to know where SQL Server will be going in the next 20 years, you can download the beta version today at postgresql.org they just have to add telemetry/spyware and work C# into it.Scale
@EvanCarroll har-har - yes, Postgres is probably the best overall RDBMS right now, but there's a lot of features that MSSQL Server has that Postgres stilll lacks (at least until Postgres 15 comes out), like MERGE and indexed-views (Postgres' Materialized Views require manual updates, SQL Server's don't).Heriot
E
11

If you want null values to be included from Y.QID then Fastest way is

SELECT * FROM Y LEFT JOIN X ON y.QID = X.QID

Note: this solution is applicable only if you need null values from Left table i.e. Y (in above case).

Otherwise INNER JOIN x ON x.qid IS NOT DISTINCT FROM y.qid is right way to do

Escarpment answered 24/7, 2014 at 4:34 Comment(0)
H
9

This article has a good discussion on this issue. You can use

SELECT * 
FROM Y
INNER JOIN X ON EXISTS(SELECT X.QID 
                       INTERSECT 
                       SELECT y.QID);
Humankind answered 17/9, 2016 at 22:26 Comment(6)
Any comparison on query plans on this versus ISNULL()? I couldn't see much difference on a sample data set, but I like the simplicity for multicolumn joins.Morrell
@ThomasG.Mayfield - ISNULL kills index use. e.g. compare the plans for CREATE TABLE #T(X INT UNIQUE); SELECT * FROM #T T1 INNER MERGE JOIN #T T2 ON EXISTS(SELECT T1.X INTERSECT SELECT T2.X);SELECT * FROM #T T1 INNER MERGE JOIN #T T2 ON ISNULL(T1.X,-1) = ISNULL(T2.X,-1);SELECT * FROM #T T1 INNER LOOP JOIN #T T2 ON EXISTS(SELECT T1.X INTERSECT SELECT T2.X);SELECT * FROM #T T1 INNER LOOP JOIN #T T2 ON ISNULL(T1.X,-1) = ISNULL(T2.X,-1);DROP TABLE #T - the merge version can't use the index order and needs a sort and the nested loops scans the inner input rather than seeking it.Humankind
I´ve used this intersect trick all the time, but it seems that SQL Server 2016 optimizer doesn´t like it much. I have to investigate it more though,Lemonade
@RomanPekar - If you find an example where it is optimised badly I'd be interested to see itHumankind
@MartinSmith if I manage to find some time to make a good example, I'm going to create a question on SO.Lemonade
@RomanPekar - sometimes using NOT EXISTS and EXCEPT is optimised better than this version. Although sometimes it is just best to write it out the long winded way. X.QID = Y.QID OR (X.QID IS NULL AND Y.QID IS NULL)Humankind
F
6

Are you committed to using the Inner join syntax?

If not you could use this alternative syntax:

SELECT * 
FROM Y,X
WHERE (X.QID=Y.QID) or (X.QUID is null and Y.QUID is null)
Full answered 4/2, 2010 at 19:17 Comment(1)
There's no benefit over using this syntax rather than the explicit join in the accepted answer though.Humankind
A
3

I'm pretty sure that the join doesn't even do what you want. If there are 100 records in table a with a null qid and 100 records in table b with a null qid, then the join as written should make a cross join and give 10,000 results for those records. If you look at the following code and run the examples, I think that the last one is probably more the result set you intended:

create table #test1 (id int identity, qid int)
create table #test2 (id int identity, qid int)

Insert #test1 (qid)
select null
union all
select null
union all
select 1
union all
select 2
union all
select null

Insert #test2 (qid)
select null
union all
select null
union all
select 1
union all
select 3
union all
select null


select * from #test2 t2
join #test1 t1 on t2.qid = t1.qid

select * from #test2 t2
join #test1 t1 on isnull(t2.qid, 0) = isnull(t1.qid, 0)


select * from #test2 t2
join #test1 t1 on 
 t1.qid = t2.qid OR ( t1.qid IS NULL AND t2.qid IS NULL )


select t2.id, t2.qid, t1.id, t1.qid from #test2 t2
join #test1 t1 on t2.qid = t1.qid
union all
select null, null,id, qid from #test1 where qid is null
union all
select id, qid, null, null from #test2  where qid is null
Alehouse answered 4/2, 2010 at 19:17 Comment(2)
My Current Join does what I want it just does it inneficiently.Tsarevna
Out of curiosity why do you want a cross join on the null ids?Alehouse
S
1

Hey it is kind of late to answer that but I got the same question, what I realized is that you must have a record with the ID of 0 in you second table to make this :

SELECT * FROM Y
INNER JOIN X ON ISNULL(Y.QID, 0) = ISNULL(X.QID, 0)

to happen, it actually says if there is none, then use 0. BUT what if Y table does NOT have a record with the ID of 0?

So, I found this method, (and worked for my case):

SELECT 
  ISNULL(Y.QName, 'ThereIsNone') AS YTableQName
FROM
  X
LEFT OUTER JOIN Y ON X.QID = Y.QID

A snapshot of my case

This way you DON'T need a record with 0 ID value in your second table (which is Y in this case and Customers in my case), OR any record at all

UPDATE:

You can also take a look at this post for better understanding.

Sulphathiazole answered 22/12, 2022 at 16:2 Comment(1)
your first thing seems to be doing fine for me. I dont get the problem you have with it. I want to join if both are null, so both will be 0. I dont need one of the tables to have a 0 alreadyAlter
E
-1

Basically you want to join two tables together where their QID columns are both not null, correct? However, you aren't enforcing any other conditions, such as that the two QID values (which seems strange to me, but ok). Something as simple as the following (tested in MySQL) seems to do what you want:

SELECT * FROM `Y` INNER JOIN `X` ON (`Y`.`QID` IS NOT NULL AND `X`.`QID` IS NOT NULL);

This gives you every non-null row in Y joined to every non-null row in X.

Update: Rico says he also wants the rows with NULL values, why not just:

SELECT * FROM `Y` INNER JOIN `X`;
Eleonoreeleoptene answered 4/2, 2010 at 18:52 Comment(2)
That is incorrect.. if is ay were y.qid = x.qid it will give me only rows that are not null.. I want the rows where they are both null as welll.Tsarevna
let me see if i can come up with a more precise example to clear the issuesTsarevna
P
-1

You could also use the coalesce function. I tested this in PostgreSQL, but it should also work for MySQL or MS SQL server.

INNER JOIN x ON coalesce(x.qid, -1) = coalesce(y.qid, -1)

This will replace NULL with -1 before evaluating it. Hence there must be no -1 in qid.

Peterman answered 11/5, 2016 at 7:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.