Does the order of tables in a join matter, when LEFT (outer) joins are used?
Asked Answered
M

3

8

I would like to confirm that the SQL query

SELECT ....
  FROM apples,
       oranges
       LEFT JOIN kiwis ON kiwis.orange_id = oranges.id,
       bananas
 WHERE ....

is exactly equivalent to other permutations in the FROM subclause, like

SELECT ....
  FROM oranges
       LEFT JOIN kiwis ON kiwis.orange_id = oranges.id,
       bananas,
       apples
 WHERE ....

or

SELECT ....
  FROM bananas,
       apples,
       oranges
       LEFT JOIN kiwis ON kiwis.orange_id = oranges.id
 WHERE ....

as long as the explicit LEFT JOIN between oranges and kiwis remains intact. From what I've read in various documents, the returned set should be exactly the same.

I'm really only concerned with the results of the query, not its performance in an actual database. (I'm using PostgreSQL 8.3, which AFAIK doesn't support optimizer hints about the join order, and will try to create an optimal query plan automatically).

Maugham answered 15/2, 2011 at 21:14 Comment(3)
Did you test it? The order of JOIN s is irrelevant unless you need specific results from previous JOIN sKylstra
Is bananas,apples, oranges a cartesian product? Or "join-in-WHERE" clause?Ornithine
EXPLAIN will show you: postgresql.org/docs/current/static/sql-explain.html And there is no reason for any hint at all, PostgreSQL is pretty smart.Maggs
O
15

It is the same but it is ambiguous as hell with the implicit CROSS JOINs. Use explicit JOINS.

If you are joining in the WHERE clause then the results may differ because joins and filters are mixed up.

SELECT ....
  FROM apples a
       JOIN
       bananas b ON ...
       JOIN 
       oranges o ON ...
       LEFT JOIN
       kiwis k ON k.orange_id = o.id
 WHERE (filters only)

Notes:

  • INNER JOINS and CROSS JOINS are commutative and associative: order does not matter usually.
  • OUTER JOINS are not, which you identified
  • SQL is declarative: you tell the optimiser what you want, not how to do it. This removes JOIN order considerations (subject to the previous 2 items)
Ornithine answered 15/2, 2011 at 21:28 Comment(1)
A very good and thorough answer, thank you. For the record, yes, the join conditions are in the WHERE clause. This is a legacy application, and I'm in the process of refactoring some of the queries.Maugham
B
1

The situation is summarized at Controlling the Planner with Explicit JOIN Clauses. Outer joins don't get reordered, inner ones can be. And you can force a particular optimizer order by dropping *join_collapse_limit* before running the query, and just putting things in the order you want them it. That's how you "hint" at the database in this area.

In general, you want to use EXPLAIN to confirm what order you're getting, and that can sometimes be used to visually confirm that two queries are getting the same plan.

Burkhart answered 15/2, 2011 at 23:44 Comment(0)
C
0

I've done SQL for donkeys years, and in all my experience, the table order does not matter. The database will look at the query as a whole and create the optimal query plan. That is why database companies employ many people with PhD in query plan optimisations.

The database vendor would commit commercial suicide if it optimised by the order in which you personally listed the SQL in your query.

Christmas answered 15/2, 2011 at 21:25 Comment(1)
Older versions of DBMS actually did it that way (e.g. Oracle 7 & 8). This was usually called a "rule based optimizer" as it statically looked at the query and applied some rules based on the structure of the query to determine the execution plan. With a rule based optimizer the first table was usually the driving table and so the order did have a major influence on the performanceDeferent

© 2022 - 2024 — McMap. All rights reserved.