Oracle USING clause best practice
Asked Answered
P

3

12

Disclaimer: I'm a developer and not a DBA.

I've been a huge fan of the USING clause in Oracle since I accidentally stumbled upon it and have used it in place of the old-fashioned ON clause to join fact tables with dimension tables ever since. To me, it creates a much more succinct SQL and produces a more concise result set with no unnecessary duplicated columns.

However, I was asked yesterday by a colleague to convert all my USING clauses into ONs. I will check with him and ask him what his reasons are. He works much more closely with the database than I do, so I assume he has some good reasons.

I have not heard back from him (we work in different timezones), but I wonder if there are any guidelines or best practices regarding the use of the "using" clause? I've googled around quite a bit, but have not come across anything definitive. In fact, I've not even even a good debate anywhere.

Can someone shed some light on this? Or provide a link to a good discussion on the topic?

Thank you!

Prothonotary answered 23/4, 2013 at 6:58 Comment(1)
This post may be of some helpCryobiology
E
28

You're presumably already aware of the distinction, but from the documentation:

ON condition Use the ON clause to specify a join condition. Doing so lets you specify join conditions separate from any search or filter conditions in the WHERE clause.

USING (column) When you are specifying an equijoin of columns that have the same name in both tables, the USING column clause indicates the columns to be used. You can use this clause only if the join columns in both tables have the same name. Within this clause, do not qualify the column name with a table name or table alias.

So these would be equivalent:

select e.ename, d.dname
from emp e join dept d using (deptno);

select e.ename, d.dname
from emp e join dept d on d.deptno = e.deptno;

To a large extent which you use is a matter of style, but there are (at least) two situations where you can't use using: (a) when the column names are not the same in the two tables, and (b) when you want to use the joining column:

select e.ename, d.dname, d.deptno
from emp e join dept d using(deptno);

select e.ename, d.dname, d.deptno
                         *
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier

You can of course just leave off the qualifier and select ..., deptno, as long as you don't have another table with the same column that isn't joined using it:

select e.ename, d.dname, deptno
from emp e join dept d using (deptno) join mytab m using (empno);

select e.ename, d.dname, deptno
                         *
ERROR at line 1:
ORA-00918: column ambiguously defined

In that case you can only select the qualified m.deptno. (OK, this is rather contrived...).

The main reason I can see for avoiding using is just consistency; since you sometimes can't use it, occasionally switching to on for those situations might be a bit jarring. But again that's more about style than any deep technical reason.

Perhaps your colleague is simply imposing (or suggesting) coding standards, but only they will know that. It also isn't quite clear if you're being asked to change some new code you've written that is going through review, or old code. If it's the latter then regardless of the reasons for them preferring on, I think you'd need to get a separate justification for modifying proven code, as there's a risk of introducing new problems even when the modified code is retested - quite apart from the cost/effort involved in the rework and retesting.

A couple of things strike me about your question though. Firstly you describes the on syntax as 'old-fashioned', but I don't think that's fair - both are valid and current (as of SQL:2011 I think, but citation needed!). And this:

produces a more concise result set with no unnecessary duplicated columns.

... which I think suggests you're using select *, otherwise you would just select one of the values, albeit with a couple of extra characters for the qualifier. Using select * is generally considered bad practice (here for example) for anything other than ad hoc queries and some subqueries.

Elyseelysee answered 23/4, 2013 at 7:51 Comment(3)
Extremely detailed and helpful answer. Accepted.Prothonotary
To the two "concerns" you raised in your pose: a. I'm only calling ON "old-fashioned" because it's been around longer. It's definitely still valid. I absolutely accept that old doesn't always mean worse. :) b. Re "select *". I'm not using it, but definitely appreciate the reminder. I was more thinking about the "virtual resultset" which my sql needs to select from. So I guess what I meant to say is I do not need to qualify the column, because I feel a bit silly being forced to qualify a column when I know the qualifier doesn't matter (in the case of an inner join).Prothonotary
I've run into your last case with an effdt column used for auditing across all tables. effdt also happen to be a key between two related tables but not another table that I need to include in the join. It's unfortunate that there seems to be no way around this.Microfarad
M
2

Related question.

It seems the main difference is syntactic: the columns are merged in a USING join.

In all cases this means that you can't access the value of a joined column from a specific table, in effect some SQL will not compile, for example:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT 1 a, 2 b, 3 c FROM dual)
  3  SELECT t.* FROM t JOIN v USING (a);

SELECT t.* FROM t JOIN v USING (a)
         ^    
ORA-25154: column part of USING clause cannot have qualifier

In an outer join this means you can't access the outer table value:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT NULL a, 2 b, 3 c FROM dual)
  3  SELECT * FROM t LEFT JOIN v USING (a)
  4   WHERE v.a IS NULL;

 WHERE v.a IS NULL
         ^
ORA-25154: column part of USING clause cannot have qualifier

This means that there is no equivalent for this anti-join syntax with the USING clause:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT NULL a, 2 b, 3 c FROM dual)
  3  SELECT * FROM t LEFT JOIN v ON v.a = t.a
  4   WHERE v.a IS NULL;

         A          B          C A          B          C
---------- ---------- ---------- - ---------- ----------
         1          2          3  

Apart from this, I'm not aware of any difference once the SQL is valid.

However, since it seems this syntax is less commonly used, I wouldn't be surprised if there were specific bugs that affect only the USING clause, especially in early versions where ANSI SQL was introduced. I haven't found anything on MOS that could confirm this, partly because the USING word is ubiquitous in bug descriptions.

If the reason for not using this feature is because of bugs, it seems to me the burden of the proof lies with your colleague: the bugs must be referenced/documented, so that the ban can eventually be lifted once the bugs are patched (database upgrade...).

If the reason is cosmetic or part of a coding convention, surely it must be documented too.

Monjan answered 23/4, 2013 at 7:51 Comment(3)
You can access the value in the first instance, you just can't include the qualifier - so ... WHERE a IS NULL would work, though I personally find it strange to not qualify something, particularly if other columns are qualified. I guess it makes sense since you've said they values must be the same, but still... The second case is interesting though, I hadn't come across that.Elyseelysee
@AlexPoole I somewhat disagree: in an OUTER join, the values from the join condition can be distinct. You can't write an anti-join with a USING clause because the columns are merged. For instance there is no equivalence for this query: WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual), v AS (SELECT NULL a, 2 b, 3 c FROM dual) SELECT * FROM t LEFT JOIN v ON v.a = t.a WHERE v.a IS NULL;Monjan
Can't argue with that *8-). I guess that's similar to Bob's example in the question you linked to.Elyseelysee
B
0

With USING you also cannot do a join like: select a.id,aval,bval,cval from a left join b on a.id = b.id left join c on c.id = b.id;

that is, only give the column from C when it is matched to a row in the B table.

Bezant answered 3/2, 2015 at 16:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.