Explicit vs implicit SQL joins
Asked Answered
H

12

495

Is there any efficiency difference in an explicit vs implicit inner join? For example:

SELECT * FROM
table a INNER JOIN table b
ON a.id = b.id;

vs.

SELECT a.*, b.*
FROM table a, table b
WHERE a.id = b.id;
Hyperbaric answered 4/9, 2008 at 22:49 Comment(11)
Good question. I'm curious why the explicit join is used at all. Is it not possible to do all queries without it?Tourniquet
use EXPLAIN keyword to know the difference about both the queries.. use JOIN and see the difference.. If you try in a table more than 100k records you can see the difference...Hardly
@Tourniquet My question was actually whether implicit join was a form of "hack" (as in "A query involving more than one table, not using a join? That's a hack isn't it?")Lavella
They are different, implicit joining will surprise you every once in a while when dealing with null values; use explicit joining and avoid bugs that arise when "nothing changed!"Tanyatanzania
There is no difference. , is CROSS JOIN with looser binding & INNER JOIN is CROSS JOIN with ON like WHERE but tighter binding. What matters to execution is how the DBMS optimizes queries.Crumpton
So the question should be how each DBMS deals with this issue. For example, I've read that Oracle recommends using explicit joins since v10, but documentation didn't mentioned if there's any difference related to performance.Reconcile
@Reconcile That Oracle explicit vs implicit is about avoiding old-style non-ANSI left join (+) notation, not about comma as inner join. Old left join syntax cannot handle all cases of keyword left join.Crumpton
@Crumpton what I was trying to say was: ok, if there is no difference between implicit and explicit join as you say (and as I already knew), the question then is if is there any difference in performance. I guess there is no difference, or shouldn't be any difference, but don't know it for sure. (Feel free to answer this implicit question).Reconcile
@Reconcile A DBMS that doesn't know that those are trivially equivalent queries has bigger performance problems than how it implements those. The justification is a textbook chapter on relational logical & physical query implementation/optimization plus the manual and/or implementation documentation for a specific version of a specific DBMS.Crumpton
I've been working on leetcode SQL problems; their environment seems to run on MySQL. And it is not accepting explicit joins, preferring implicit joins for "accepted" answers. Is this specific to leetcode, or does MySQL truly not support implicit joins?Of
Related question: https://mcmap.net/q/49646/-what-39-s-the-purpose-of-an-implicit-join-in-sql/521799Thrombophlebitis
D
172

Performance-wise, they are exactly the same (at least in SQL Server).

PS: Be aware that the "implicit OUTER JOIN" syntax--using *= or =* in a WHERE after using comma--is deprecated since SQL Server 2005. (The "implicit (CROSS) JOIN" syntax using comma as used in the question is still supported.)

Deprecation of "Old Style" JOIN Syntax: Only A Partial Thing

Dorchester answered 4/9, 2008 at 22:56 Comment(16)
@lomaxx, just for clarity's sake, could you specify which syntax of the 2 in the question is deprecated?Toomin
implicit join syntax is supported by SQL Server 2005 out of the box, but yes, it's a bad idea.Profitsharing
Although I prefer the explicit syntax, can you explain how can they be deprecating implicit joins? The idea that it could be deprecated seems odd and the suggestion that they aren't supported by SQL 2K5 is not corrrect.Robinette
Can you provide supporting documentation? This sounds wrong on multiple levels.Dosser
How do you deprecate the SQL standard?Pejoration
They are deprecating outer joins with the old syntax, not inner joins. The reason is that they are problematic to get correct, and are in some cases possible to satisfy with different execution plans that produce different results.Canotas
@david Crenshaw, the implicit join is no longer in the standard and hasn't been for 18 years.Resonate
In standard SQL, an implicit join isn't an outer join. It's either an inner join or a cross join.Caddish
Please provide supporting evidence and let me know somehow and I'll remove the downvote. Or at least mention the test and measurement that you used.Nigrify
So-called "implicit joins" of the 'inner' or 'cross' variety remain in the Standard. SQL Server is deprecating the "old-style" outer join syntax (i.e. *= and =*) which has never been Standard.Kilk
this is wrong.. if JOIN is not used there is problem in creating more number of rows by cross product. use EXPLAIN keyword to know the difference about the queries.Hardly
I've improved the answer with an external link and a clear distinction between the deprecated implict outer join syntax and the implicit inner join syntax that is used in the question.Shallot
@Resonate this is incorrect. Implicit join syntax (FROM a, b, c) is not being deprecated. It's still in 2003+ SQL standards.Cochin
Old (+) and *= outer join syntax was never part of standard SQL. And it has nothing to do with this question.Crumpton
I know this is an old thread, but I have this question and kind of oot, are we able to use implicit and explicit join in the same time?Morbid
@Morbid yes, you can mix the two syntaxes. it can be tricky - due to different precedence rules - but there is no rule against it. Example: FROM a, b JOIN c ON b.id=c.id would be accepted by any DBMS.Cochin
B
159

Personally I prefer the join syntax as its makes it clearer that the tables are joined and how they are joined. Try compare larger SQL queries where you selecting from 8 different tables and you have lots of filtering in the where. By using join syntax you separate out the parts where the tables are joined, to the part where you are filtering the rows.

Benignant answered 4/9, 2008 at 23:23 Comment(1)
I completely agree, but this is kind of off-topic. OP asked about efficiency.Soult
J
74

On MySQL 5.1.51, both queries have identical execution plans:

mysql> explain select * from table1 a inner join table2 b on a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref          | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL         |  986 |       |
|  1 | SIMPLE      | a     | ref  | pid           | pid  | 4       | schema.b.pid |   70 |       |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.02 sec)

mysql> explain select * from table1 a, table2 b where a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref          | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL         |  986 |       |
|  1 | SIMPLE      | a     | ref  | pid           | pid  | 4       | schema.b.pid |   70 |       |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.00 sec)

table1 has 166208 rows; table2 has about 1000 rows.

This is a very simple case; it doesn't by any means prove that the query optimizer wouldn't get confused and generate different plans in a more complicated case.

Jara answered 25/4, 2012 at 1:43 Comment(2)
This should be the accepted answer. This is correct, the plan is the same (or close to with bigger statements) but the amount of records will be drastic, thus causing difference in performance.Thrombosis
postgresql.org/docs/14/explicit-joins.html "Controlling the Planner with Explicit JOIN Clauses" has some comments that may be of interest for PostgreSQL. Also: dba.stackexchange.com/questions/198182/…Shipley
H
46

The second syntax has the unwanted possibility of a cross join: you can add tables to the FROM part without corresponding WHERE clause. This is considered harmful.

Hellgrammite answered 25/11, 2008 at 14:13 Comment(2)
What if the table names in the from clause are generated from the tables used in the where clause?Pammy
you can do a cross join with the explicit JOIN syntax as well.(https://mcmap.net/q/49647/-why-do-cross-join-conditions-not-work-in-the-39-on-39-clause-only-the-where-clause) you probably meant that it is less strict, thus more prone to user error.Ptisan
G
17

The first answer you gave uses what is known as ANSI join syntax, the other is valid and will work in any relational database.

I agree with grom that you should use ANSI join syntax. As they said, the main reason is for clarity. Rather than having a where clause with lots of predicates, some of which join tables and others restricting the rows returned with the ANSI join syntax you are making it blindingly clear which conditions are being used to join your tables and which are being used to restrict the results.

Guadiana answered 7/9, 2008 at 9:55 Comment(0)
P
7

@lomaxx: Just to clarify, I'm pretty certain that both above syntax are supported by SQL Serv 2005. The syntax below is NOT supported however

select a.*, b.*  
from table a, table b  
where a.id *= b.id;

Specifically, the outer join (*=) is not supported.

Pia answered 4/9, 2008 at 23:39 Comment(1)
Frankly I wouldn't use it even in SQL Server 2000, the *= syntax often gives wrong answers. Sometimes it interprets these as cross joins.Resonate
L
6

Performance wise, they are exactly the same (at least in SQL Server) but be aware that they are deprecating this join syntax and it's not supported by sql server2005 out of the box.

I think you are thinking of the deprecated *= and =* operators vs. "outer join".

I have just now tested the two formats given, and they work properly on a SQL Server 2008 database. In my case they yielded identical execution plans, but I couldn't confidently say that this would always be true.

Larena answered 4/9, 2008 at 23:33 Comment(0)
C
3

On some databases (notably Oracle) the order of the joins can make a huge difference to query performance (if there are more than two tables). On one application, we had literally two orders of magnitude difference in some cases. Using the inner join syntax gives you control over this - if you use the right hints syntax.

You didn't specify which database you're using, but probability suggests SQL Server or MySQL where there it makes no real difference.

Chorea answered 4/9, 2008 at 23:38 Comment(2)
Leigh, you can use the hints in implicit joins too.Pulpy
In Oracle it is extremely rare for the join order to affect the execution plan in a meaningful way. See this article by Jonathan Lewis for an explanation.Hauser
S
2

As Leigh Caldwell has stated, the query optimizer can produce different query plans based on what functionally looks like the same SQL statement. For further reading on this, have a look at the following two blog postings:-

One posting from the Oracle Optimizer Team

Another posting from the "Structured Data" blog

I hope you find this interesting.

Shrapnel answered 17/9, 2008 at 17:44 Comment(1)
Mike, the difference they are talking about is that you need to be sure that if you specify an explicit join, you specify the join condition to join on, not the filter. You will note that for semantically correct queries, the exec plan is the same.Pulpy
S
2

Basically, the difference between the two is that one is written in the old way, while the other is written in the modern way. Personally, I prefer the modern script using the inner, left, outer, right definitions because they are more explanatory and makes the code more readable.

When dealing with inner joins there is no real difference in readability neither, however, it may get complicated when dealing with left and right joins as in the older method you would get something like this:

SELECT * 
FROM table a, table b
WHERE a.id = b.id (+);

The above is the old way how a left join is written as opposed to the following:

SELECT * 
FROM table a 
LEFT JOIN table b ON a.id = b.id;

As you can visually see, the modern way of how the script is written makes the query more readable. (By the way same goes for right joins and a little more complicated for outer joins).

Going back to the boiler plate, it doesn't make a difference to the SQL compiler how the query is written as it handles them in the same way. I've seen a mix of both in Oracle databases which have had many people writing into it, both elder and younger ones. Again, it boils down to how readable the script is and the team you are developing with.

Shroyer answered 2/5, 2019 at 11:0 Comment(0)
H
0

Performance wise, it should not make any difference. The explicit join syntax seems cleaner to me as it clearly defines relationships between tables in the from clause and does not clutter up the where clause.

Hickox answered 30/11, 2011 at 18:30 Comment(0)
E
-1

In my experience, using the cross-join-with-a-where-clause syntax often produces a brain damaged execution plan, especially if you are using a Microsoft SQL product. The way that SQL Server attempts to estimate table row counts, for instance, is savagely horrible. Using the inner join syntax gives you some control over how the query is executed. So from a practical point of view, given the atavistic nature of current database technology, you have to go with the inner join.

Epirus answered 13/8, 2015 at 18:9 Comment(1)
Do you have any proof of this? Because the accepted answer says otherwise.Robins

© 2022 - 2024 — McMap. All rights reserved.