SQL JOIN: is there a difference between USING, ON or WHERE?
Asked Answered
A

6

53

I was wondering if there is any difference in the way SQL performs on these join statements:

SELECT * FROM a,b WHERE a.ID = b.ID

SELECT * FROM a JOIN b ON a.ID = b.ID

SELECT * FROM a JOIN b USING(ID)

Is there a performance difference? Or algorithmic difference?

Or is it just syntactic sugar?

Antabuse answered 13/4, 2011 at 19:8 Comment(7)
I always use the 'ON' version if at all possible, just to make it explicitly clear what the join condition is. On a long query, the where could be miles away from the join, leaving you wondering what it's for.Kittle
Have tried looking at the query evaluation plan using explain command?Teplica
Anyone know if it's documented somewhere that there's no actual difference ?Antabuse
possible duplicate of SQL INNER JOIN questionAquaplane
+1 for kicking off a good discussion despite being asked fairly often.Fermata
Just an FYI - usage of USING keyword for equi joins isn't supported in SQL Server. USING keyword is used to tell the query batch, which database to use for firing the queries in the batch.Paulita
USING returns only one copy of each specified column, unlike comma & ON.Manion
F
56

There is no difference in performance.

However, the first style is ANSI-89 and will get your legs broken in some shops. Including mine. The second style is ANSI-92 and is much clearer.

Examples:

Which is the JOIN, which is the filter?

FROM T1,T2,T3....
WHERE T1.ID = T2.ID AND
     T1.foo = 'bar' AND T2.fish = 42 AND
     T1.ID = T3.ID

FROM T1 
   INNER JOIN T2 ON T1.ID = T2.ID
   INNER JOIN T3 ON T1.ID = T3.ID
WHERE
   T1.foo = 'bar' AND T2.fish = 42

If you have OUTER JOINs (=*, *=) then the 2nd style will work as advertised. The first most likely won't and is also deprecated in SQL Server 2005+

The ANSI-92 style is harder to bollix too. With the older style you can easily end up with a Cartesian product (cross join) if you miss a condition. You'll get a syntax error with ANSI-92.

Edit: Some more clarification

  • The reason for not using "join the where" (implicit) is the dodgy results with outer joins.
  • If you use explicit OUTER JOINs + implicit INNER JOINs you'll still get dodgy results + you have inconsistency in usage

It isn't just syntax: it's about having a semantically correct query

Edit, Dec 2011

SQL Server logical query processing order is FROM, ON, JOIN, WHERE...

So if you mix "implicit WHERE inner joins" and "explicit FROM outer joins" you most likely won't get expected results because the query is ambiguous...

Fermata answered 13/4, 2011 at 19:13 Comment(10)
Do you know if it's documented somewhere that it's just a syntax change ?Antabuse
ANSI-92 supersedes ANSI-89? Practically, it's about clarity and maintainability. You seem to object to the JOIN syntax...Fermata
@Yochai Timmer 92 > 89 last I checked.Casque
@Yochai Timmer: I'll try again... The reason for not using "join the where" (implicit) is the danger with outer joins. Then if you use explicit OUTER JOINs + implicit INNER JOINs you'll still get dodgy results + you have inconsistency in usage. It isn't syntax: it's semanticsFermata
@Yochai Timmer - so what other documentation are you looking for?@Fermata hit it right on the spot. He is giving you solid information and good programming practices.Casque
"Which is the JOIN, which is the filter?" -- these terms are subjective so the question is loaded: cannot answer without seeing the full query and knowing its purpose. You could say for INNER JOIN it makes not difference. If they were OUTER JOINs changing (what you imply are) the filter conditions into join conditions the query results could potentially change based on NULLs but I couldn't tell whether this is 'wrong' without knowing the designer's intention.Peh
@onedaywhen: not sure what you're trying to say. Do you endorse using deprecated OUTER JOIN in the WHERE syntax? Do you understand why OUTER JOIN in the WHERE is ambiguous? Do you not like clarity and semantic correctness?Fermata
@Peh don't you mean one often 'filters' for the NULL produced by the ON clause in the WHERE clause?Moderato
@ScottEdwards2000: point taken: I shouldn't try to use colloquialisms or encourage others to do so!Peh
The only difference between comma & cross join is comma has lower precedence. Although that's a good reason to not mix them. Comma has always been ANSI/ISO/standard.Manion
C
4

I despise when you force a join by using WHERE. It just doesn't to me look right, a dirty hack. Proper ANSI join is to use ON:

SELECT 
    p.Product,
    o.Order
FROM 
    Product p
INNER JOIN
    Order o
ON
    o.OrderID = p.OrderID

Prefer using ON when joining and WHERE to filter results. Remember WHERE is one of the last things you will use besides grouping and order by where you want to filter your results. So you shouldn't join your tables using WHERE as it is much difficult to read.

SELECT 
    p.Product,
    o.Order
FROM 
    Product p
INNER JOIN
    Order o
ON
    o.OrderID = p.OrderID
WHERE
    o.Category = 'IT'

In the end you (the developer) might not be around in the future so readability and maintainability will help the pour soul who has to take over your code :).

When I see developers use WHERE to join their tables it's usually an indication that they don't know enough T-SQL. That is my personal opinion.

Casque answered 13/4, 2011 at 19:10 Comment(7)
the OP did not ask your opinion on whether it is 'pretty' not to use the ANSI join syntax, just about possible differences in implementationDetach
@Detach - I think my post clarifies exactly his question.Casque
@jeroenh: it isn't about pretty: it about clarity, maintainability, and using something that has been standard for 20 years nearly. +1. Are we not here to teach and encourage best practice?Fermata
I fully agree that proper JOIN syntax is more readable and better maintainable and I withdrew my downvote after the edit. But I maintain my position: the question is about possible performance/algorithmic differences, not about syntax. So this still does not answer the question. @oJmHo can we hold a decent discussion please instead of calling each other names?Detach
@Detach - That is fine, in no way was I saying one is better then the other performance wise. It's the proper way to do things, just as it is proper for you to shower when you wake up in the morning :). You can get away with not showering but someone will smell your odor. Just as its proper to use proper coding conventions, otherwise someone will smell the bad code from the good code.Casque
ha ha. I assumed implcit joins could be classed as a code smell then...?Fermata
You realise that the JOIN syntax hasn’t always been available, so the WHERE syntax is what was available at the time. That’s changed, of course, but it does explain why some may have got used to it. However, I agree with the notion of using JOIN for joins, and WHERE for other filters.Cyanosis
T
4

The difference is readability and maintainability. SELECT * FROM a JOIN b ON a.ID = b.ID conveys your exact intent, all in the same place.

I won't say definitively since I haven't gotten under the hood of the last query optimizer, but I'm pretty confident you're looking at a trivial difference in performance, if any.

Truong answered 13/4, 2011 at 19:12 Comment(0)
D
3

No one has provided an answer about the USING(...) syntax yet.

While these two queries are equivalent logically, and also from the perspective of most modern optimisers:

SELECT * FROM a, b WHERE a.id = b.id
SELECT * FROM a JOIN b ON a.id = b.id

This one has a slightly different semantics:

SELECT * FROM a JOIN b USING (id)

Assuming the following schema:

CREATE TABLE a (id int, a1 int, a2 int);
CREATE TABLE b (id int, b1 int, b2 int);

The first two queries will have their asterisks expanded to:

SELECT a.id, a.a1, a.a2, b.id, b.a1, b.a2 FROM ...

Whereas the third query will have its asterisk expanded to:

SELECT coalesce(a.id, b.id) AS id, a.a1, a.a2, b.a1, b.a2 FROM ...

This is quite different for various reasons, including:

  • The number of projected columns is now 5 instead of 6. This could get in the way when you use UNION or other set operations. Which you probably don't, with an asterisk.
  • There is no longer a qualified (and duplicate) a.id or b.id column, only an id column. While PostgreSQL still allows for qualifying references to id (e.g. when needing to disambiguate them), Oracle for example doesn't.
  • As a consequence, in the third query with the USING(...) syntax, projecting a.* or b.* is no longer possible in Oracle.
Dreibund answered 5/3, 2020 at 16:50 Comment(0)
H
0

See this

INNER JOIN ON vs WHERE clause

Holbert answered 13/4, 2011 at 19:14 Comment(0)
F
0

This is a duplicate of this SO question: Explicit vs implicit SQL joins. Generally I think the implicit (where version) is bad form and not as clear as the explicit (on version). I also think the implicit is being depreciated but not 100% on that one. The execution plan is the same for both though.

Foolscap answered 13/4, 2011 at 19:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.