is NATURAL JOIN any better than SELECT FROM WHERE in terms of performance? [duplicate]
Asked Answered
P

5

8

Possible Duplicate:
Inner join vs Where

Today I got into a debate with my project manager about Cartesian products. He says a 'natural join' is somehow much better than using 'select from where' because the later cause the db engine to internally perform a Cartesian product but the former uses another approach that prevents this. As far as I know, the natural join syntax is not any different in anyway than 'select from where' in terms of performance or meaning, I mean you can use either based on your taste.

SELECT * FROM table1,table2 WHERE table1.id=table2.id
SELECT * FROM table1 NATURAL JOIN table2

please elaborate about the first query causing a Cartesian product but the second one being somehow more smart

Panhellenism answered 17/6, 2010 at 15:28 Comment(8)
https://mcmap.net/q/99578/-inner-join-vs-whereBerylberyle
So basically, it depends on how smart the query optimizer is. Probably smart enough.Cecilia
I think your project manager is correct that the second syntax is preferred, but for the wrong reasons. The execution plan is the same (where does he get this Cartesian product nonsense?) but the JOIN syntax is somewhat easier to read, especially when dealing with more complex queries.Weimer
When I was first learning SQL, I wasn't using joins. But then, after realizing the elasticity of the query when using joins (left, right, inner, outer), I just started writing everything with joins. It's simply faster to edit from one type to another.Berylberyle
@gregcase - IMO, you should never use the first syntax with commas. If you want an inner join, use JOIN. If you want a Cross Join, use CROSS JOIN. Make your intent clear. The comma syntax should be outlawed IMO.Moslemism
Man, wouldn't it be wonderful if there we could actually outlaw bad practices? GOTO statements: first ones a misdeamoner, second one gets you jail time!Weimer
Doesn't answer the question about NATURAL JOIN. See my update.Halberd
The current accepted answer doesn't answer your question...Halberd
H
14

The correct way should be explicit with filters and joins separated

SELECT * FROM table1 JOIN table2 ON table1.id = table2.id

NATURAL JOINS may be easy and "clean" but more likely to be utterly unpredictable...

Edit, May 2012.

The accepted answer for the duplicate doesn't actually answer NATURAL JOIN.
These links discuss in further detail.

tl;dr

Performance isn't the issue: but your queries should be reliable and predictable which NATURAL JOIN certainly isn't.

"JOIN in the WHERE" aka implied JOIN aka what you call "Cartesian" is also bad as per these links (the same applies to Oracle as well as SQL Server)

Halberd answered 17/6, 2010 at 15:38 Comment(3)
+1: The more obvious things are, the easier they are to fix or improve.Jacobsohn
+1 Mercifully, a good number of DB vendors do not support Natural Joins. IMO, explicit joins are easier for the reader as they are not left with the mystery of determining the joining columns.Moslemism
In particular, say you have a second column like "is_active" that is common to both tables. That would be included in a natural join, and you'd likely get fewer rows than you were expecting.Calve
P
4

It depends.

A natural join links all columns in two tables with the same name. If the only two columns in tables 1 and 2 with the same name are ID, then the two queries should be evaluated identically by the optimiser; on the other hand, if there are more than two columns with the same name (or none at all) in the two tables, a completely different query is being performed.

In any case, a cartesian product will almost invariably (I'm tempted to say always) perform worse than any other type of join, as it joins every record of one table with every record of the other table.

How good is your manager at distinguishing his gluteus maximus from the upper end of his ulna?

Paulson answered 17/6, 2010 at 15:44 Comment(2)
The first query is not really cartesian: just a JOIN in the WHERE clause. I agree with your sentiments about anatomical knowledge thoughHalberd
Perhaps I should have said explicitly that a cartesian join will never occur for the first query, and only for the second if there are no matching columns in the two tables.Paulson
H
3

Performance-wise, there is no difference. Its been discussed over and over and over again. Googling for "join syntax oracle vs where" results in several good articles, including the one on this site referenced by Alexander.

However, be careful using a NATURAL JOIN. It will pick up on common columns like createdate or createuser or such that you normally really don't care about joining on and may cause problems. I highly recommended against NATURAL JOIN in production...just use INNER JOIN and specify the columns.

Even Tom agrees.

Hickman answered 17/6, 2010 at 15:37 Comment(0)
A
2

First thing to point out is that database optimizers interpret syntax in their own way. Obviously each product varies but I would be frankly astonished if any DBMS penalised what is the commonest mechanism for joining tables.

With regards to terminology, it is a cross join which generates a cartesian product. That is different from an inner join, and would generate a different result set.

Finally, natural joins are horrible, literally bugs waiting to happen. They should be avoided by all right-thinking people.

Asmara answered 17/6, 2010 at 15:38 Comment(0)
O
2

I would not use either syntax. Your query indicates an inner join, I would use the explicit syntax for that. You should not be using implied joins ever, they are subject to misinterpretation (was that an accidental cross join or did you mean to do that?) and accidental cross joins. Would you use C# code that was replaced 18 years agao with a better syntax (well actually C# didn't exist 18 years ago, but I think you understand what I'm saying)? Then why are you using outdated SQL code?

Not only is the implied join a problem for maintenance but it can be a big problem if you try to use the implied join syntax for outer joins as that does not work correctly in some databases and is also deprecated in at least one database, SQL Server, I know. And if you have the need for a filter on the table in the left join, you can't do that with the implied syntax at all becasue it will convert it to an innner join.

Yes your code works but it is a poor technique and you should get used to using the inner join explicitly, so that you are making your intent clear to furture maintainers and so you don't create accidental problems as you write more complex queries. If using the explicit syntax is not second nature for you, you will really struggle when you need to use if for something more complicated.

I have never in 30 years of querying databases seen a need to write a natural join and had to look up what one was, so using that is not more clear than the implied join.

Obstreperous answered 17/6, 2010 at 17:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.