Is there any difference between CROSS APPLY and OUTER APPLY when creating a cartesian product?
Asked Answered
P

2

6

When creating a cartesian product between two tables, is there any difference between CROSS APPLY and OUTER APPLY?

This may seem like a silly question given that without a relationship expressed between the tables, the right-hand table can't fail to satisfy the relation, but I'm respectful of what I don't know.

When I look at the execution plans with a simple test setup, they're identical [two index seeks feeding into Nested Loops (Inner Join)], but simple test setups can be deceptive.

Here's an example of what I mean (SQL Fiddle). The setup:

CREATE TABLE dbo.First (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.First (Name) VALUES ('First' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__First__Name ON dbo.First(Name);
GO
CREATE TABLE dbo.Second (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.Second (Name) VALUES ('Second' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__Second__Name ON dbo.Second(Name);
GO

Using CROSS APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

Using OUTER APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second                                 -- <== Only change is here
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

...both of which give me the expected four rows.

Plus various variations where either, or both, IN clauses return no matches:

-- No match in First
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

-- No match in Second
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

-- No match in either
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

...all of which give me the expected zero rows.

Primine answered 26/1, 2018 at 11:5 Comment(3)
I give it high odds that the answer will require the use of <!-- a filler comment or similar --> to meet the minimum character length, "no" being too short... ;-)Primine
SELECT * FROM #A CROSS APPLY #A, SELECT * FROM #A OUTER APPLY #A and SELECT * FROM #A CROSS JOIN #A are all synonymous. The different syntaxes will yield no benefits (or detriments).Pudens
With a condition like here OUTER APPLY Second WHERE First.Name IN ('First253', 'First3304') AND Second.Name IN ('no match'); you are out-tricking the OUTER...Counteroffensive
C
4

The difference comes into play when applied table or table-valued function has no records:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE       First.Name IN ('First253', 'First3304');

2 rows returned


SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE       First.Name IN ('First253', 'First3304');

0 rows returned

In OP's own words:

Not the way you're doing it, because conceptually you're filtering with WHERE after the APPLY (although the plans show the engine optimizing by doing it first); but if you explicitly filter first and then APPLY like this:

SELECT      First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM        First
CROSS APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond 
WHERE       First.Name IN ('First253', 'First3304');

you'd see the difference because you'd get rows with NULLs with the OUTER but no rows with the CROSS.

Cleanshaven answered 26/1, 2018 at 11:26 Comment(11)
But that's not a cartesian product anymore...?Primine
OK. Cartesian product of a table with an empty table would be empty for cross apply and would contain records from the first table with outer apply.Cleanshaven
In your examples you first create a cartesian product and then filter it through WHERE. Try leaving the Second table empty altogether and do an unconditional cartesian product.Cleanshaven
More on cartesian products: ∅×B, A×∅, ∅×∅. Theoretically, what OUTER APPLY does is not a cartesian product.Cleanshaven
@Y.B True for the last comment. But the engine will decide in many (fairly simple) cases to deal with it exactly the same way as with a JOIN. Therefore - practically - it is the same very often :-D (you knew that assumably :-D +1 from my side)Counteroffensive
So your answer is "Not the way you're doing it, because conceptually you're applying the filtering with WHERE after the APPLY (although the plans show the engine optimizing by doing it first); but if you explicitly filter first and then APPLY (like this: SELECT First.Id AS FirstId, FilteredSecond.Id AS SecondId FROM First CROSS APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond WHERE First.Name IN ('First253', 'First3304');, you'd see the difference because you'd get rows with NULLs with the OUTER but no rows with the CROSS." Is that basically what you're saying?Primine
Obviously I know that reorganizing like that would result in rows with NULLs, but the main point being "Not the way you're doing it, because conceptually you're applying the filtering after the product."Primine
(With an OUTER I mean.)Primine
@T.J.Crowder Yep. :-)Cleanshaven
Cool, let's move that into the answer and I'll have a think. But I think that's probably the answer.Primine
@T.J.Crowder That's exactly what I've meant with you are out-tricking the OUTERCounteroffensive
C
2

Think of CROSS APPLY as related to an INNER JOIN and OUTER APPLY as related to a LEFT JOIN.

  • CROSS / INNER will limit the result to rows from both sources, while
  • OUTER / LEFT will return all rows of the first table and NULLs in case of no related row in the second source.

The difference is that JOINs link two resultset over a relation condition, while APPLY is called row-wise with values of a current row.

You can use APPLY to create computed values out of row values or (which is the main purpose) to call a TVF with row-values as parameters. Often you can see APPLY in connection with XMLTypedVariable.nodes().

Some thoughts about execution In simple cases (like above) the engine will decide to walk the same paths. But with more complex scenarios the differences can be huge.

Counteroffensive answered 26/1, 2018 at 11:26 Comment(3)
That's all sort of high-level description stuff, which is fine, but I'm not seeing how it addresses the question...?Primine
@T.J.Crowder I see your point, but - as the OP uses conditions (WHERE clauses) the cartesian product might not be an each-with-each in any case. With a more complex query APPLY and JOIN will be handled differently. But it's hard to forecast, what the engine decides to be best...Counteroffensive
@T.J.Crowder, sorry, did not realise that you are the "OP" :-DCounteroffensive

© 2022 - 2024 — McMap. All rights reserved.