INNER JOIN vs INNER JOIN (SELECT . FROM)
Asked Answered
M

5

73

Is there any difference in terms of performance between these two versions of the same query?

--Version 1
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN SalesOrderDetail s on p.ProductID = s.ProductID

--Version 2
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID

I've heard it said (by a DBA) that Version 2 is faster because it fetches, within the inner SELECT statement, only the columns that are required for the query. But that doesn't seem to make sense, since query performance (as I know) is based on number of rows affected and final list of columns returned.

The query plans for both are identical, so I'm guessing there isn't any difference between the two.

Am I correct?

Muzzleloader answered 20/7, 2012 at 7:3 Comment(5)
Fire that DBA...Yoga
Of course, he is not right. The only explanation I have that he has this opinion since some older version of sql-server or some SP or CU or maybe even another server.Abyssal
Actually, in MySql this can be true (the subselect can radically improve performance by enforcing access order). Scary, I know. And why I will avoid MySql like The Plague forevermore.Poeticize
@ErikE: Hmm. MySql does always come up as an odd one out while discussing SQL, doesn't it? :-)Muzzleloader
According to EXPLAIN of PostgreSQL 9.5.6, INNER JOIN table will fetch all columns, while INNER JOIN (SELECT column1, column2, ...) will only fetch specified columns and has less data width than first one. But this only applies to PostgreSQL 9.5.6 and not yours. Fire that DBA :PPaolapaolina
P
53

You are correct. You did exactly the right thing, checking the query plan rather than trying to second-guess the optimiser. :-)

Passing answered 20/7, 2012 at 7:7 Comment(0)
M
18

There won't be much difference. Howver version 2 is easier when you have some calculations, aggregations, etc that should be joined outside of it

--Version 2 
SELECT p.Name, s.OrderQty 
FROM Product p 
INNER JOIN 
(SELECT ProductID, SUM(OrderQty) as OrderQty FROM SalesOrderDetail GROUP BY ProductID
HAVING SUM(OrderQty) >1000) s 
on p.ProductID = s.ProdctId 
Motion answered 20/7, 2012 at 7:11 Comment(2)
Yeah, I get that when calculations or aggregations are involved this approach is easier, but my question was specific to the case I presented. Thanks!Muzzleloader
Can you explain me, will a performance issue if i use: SELECT p.Name, s.OrderQty FROM Product p INNER JOIN SalesOrderDetail s on p.ProductID = s.ProductID GROUP BY p.Name HAVING SUM(s.OrderQty) > 1000; Thanks in advance!Tanana
K
1

Seems to be identical just in case that SQL server will not try to read data which is not required for the query, the optimizer is clever enough

It can have sense when join on complex query (i.e which have joings, groupings etc itself) then, yes, it is better to specify required fields.

But there is one more point. If the query is simple there is no difference but EVERY extra action even which is supposed to improve performance makes optimizer works harder and optimizer can fail to get the best plan in time and will run not optimal query. So extras select can be a such action which can even decrease performance

Kraft answered 20/7, 2012 at 7:11 Comment(0)
P
0

Even though for smaller data sets it doesn't matter, however, a better way is to use the CROSS APPLY method instead of a JOIN statement. Especially if your servers are being hit with many queries. The code would be:

SELECT p.Name, s.OrderQty
FROM Product p
CROSS APPLY (SELECT OrderQty 
FROM   SalesOrderDetail
WHERE  ProductID = p.ProductID ) s
Pricecutting answered 25/10, 2022 at 13:37 Comment(0)
I
-3

You did the right thing by checking from query plans. But I have 100% confidence in version 2. It is faster when the number off records are on the very high side.

My database has around 1,000,000 records and this is exactly the scenario where the query plan shows the difference between both the queries. Further, instead of using a where clause, if you use it in the join itself, it makes the query faster :
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID WHERE p.isactive = 1

The better version of this query is :

SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID AND p.isactive = 1

(Assuming isactive is a field in product table which represents the active/inactive products).

Irrefragable answered 20/7, 2012 at 9:3 Comment(3)
-1 SQL Server will easily transform the first to the second. If you can provide a reproducible test case where it doesn't (rather than just an assertion) I will happily reverse it!Oversweet
You are absolutely right Martin, the effect is not visible with INNER JOIN, but for outer join it is visible. Obviously inner join would work same in both the scenarios. Due to confidentiality issue I cant give the snapshot of my db, but here is what I was talking about :blog.sqlauthority.com/2009/03/15/…Irrefragable
That blog you mentioned is related to the resultset and not related to performance and that too when you put some condition on right table in left join in where clause.Sabian

© 2022 - 2024 — McMap. All rights reserved.