How to do a full outer join without having full outer join available
Asked Answered
P

4

5

Last week I was surprised to find out that sybase 12 doesn't support full outer joins. But it occurred to me that a full outer join should be the same as a left outer join unioned with a right outer join of the same sql. Can anybody think of a reason this would not hold true?

Prosper answered 18/1, 2010 at 22:23 Comment(2)
What happened when you tried that?Tremolite
I got what appeared to be the right data, but one test case doesn't necessarily answer the question fully.Prosper
W
2

UNION-ing two OUTER JOIN statements should result in duplicate rows representing the data you'd get from an INNER JOIN. You'd have to probably do a SELECT DISTINCT on the data set produced by the UNION. Generally if you have to use a SELECT DISTINCT that means it's not a well-designed query (or so I've heard).

Winna answered 18/1, 2010 at 22:54 Comment(1)
@Josh. DISTINCT does more than is required here; just exclude ALL. Yes, if you have to use DISTINCT or OUTER JOIN that's a sure sign that there is something wrong; with the db design, not the code.Butch
G
5

UNION ALL the left join with the right join, but limit the right join to only rows that do not exist in the base table (return null on the join when they would not be null in the table if they existed).

For this code you will need to create two tables t1 and t2. t1 should have one column named c1 with five rows containing the values 1-5. t2 should also have a c1 column with five rows containing the values 2-6.

Full Outer Join:

select * from t1 full outer join t2 on t1.c1=t2.c1 order by 1, 2;

Full Outer Join Equivalent:

select t1.c1, t2.c1 from t1 left join  t2 on t1.c1=t2.c1
union all
select t1.c1, t2.c1 from t1 right join t2 on t1.c1=t2.c1 
where t1.c1 is null
order by 1, 2;

Note the where clause on the right joined select that limits the results to only those that would not be duplicates.

Gigantean answered 10/1, 2011 at 18:47 Comment(5)
@Leigh. 1) You cannot put a NULL in a NOT NULL column, the code will not work 2) This is oracle code, we do not have dual or row_ids in Sybase.Butch
1) I thought you meant what you had stated; I didn't realise you were stating the definition of an Outer Join which stu obviously understands (refer his post). Cleared 2) Remove the ALL ( as detailed in 2 other answers) and the WHERE clause. First you are includes dupes; then you exclude the dupes; it is cheaper and more direct to refrain from selecting the dupes in the first place, by excluding the ALL, then the WHERE is not required.Butch
@Butch Removing the ALL is specifically so that the UNION can remove the duplicates. My version is eliminating the duplicates before the UNION ALL, while the UNION versions are eliminating them on the UNION. You may be right, the later may be cheaper depending on the data set; regardless this solution is still valid.Gigantean
According to the FAQ, we are encouraged to correct & improve answer. Sybase question. It does the removal of dupes during the UNION. I did not vote you down. I see your purpose now. Worth testing re cheaper.Butch
@Butch Indeed, thank you for encouraging me to correct/improve this answer. It may not be the best solution, but it is at least a solution to be considered.Gigantean
W
2

UNION-ing two OUTER JOIN statements should result in duplicate rows representing the data you'd get from an INNER JOIN. You'd have to probably do a SELECT DISTINCT on the data set produced by the UNION. Generally if you have to use a SELECT DISTINCT that means it's not a well-designed query (or so I've heard).

Winna answered 18/1, 2010 at 22:54 Comment(1)
@Josh. DISTINCT does more than is required here; just exclude ALL. Yes, if you have to use DISTINCT or OUTER JOIN that's a sure sign that there is something wrong; with the db design, not the code.Butch
G
1

If you union them with UNION ALL, you'll get duplicates. If you just use UNION without the ALL, it will filter duplicates and therefore be equivalent to a full join, but the query will also be a lot more expensive because it has to perform a distinct sort.

Gerald answered 18/1, 2010 at 23:13 Comment(5)
A lot more expensive than union all or a lot more expensive than a native full outer join?Prosper
@stu: Both. Even though a native full join is technically the same thing as a UNION, database engines that support full joins will have a way to optimize it. Having said that - full joins are still pretty slow and worth avoiding if you can. I don't think I ever use them.Gerald
Aaronaught. That is not correct. Sybase ASE 12 sort for this purpose is extremely fast, it uses internal work tables and sort buffers. The enterprise class engines are quite different to the freeware.Butch
@Perf: I don't use the "freeware", so I wouldn't know. And I didn't say that it was slow, I said it was a lot more expensive than UNION ALL, which is true. UNION ALL is simply concatenation, which is practically the cheapest set operation you can do.Gerald
I have 20 years of Sybase and have not heard that terminology before, I was guessing. Sorry. Sure, an UNION has one additional step over an UNION ALL, but Sybase does a Hash Join while streaming the data, so it is not a "lot more expensive", just a little. Which of course results in fewer rows, but that processing has to be done somewhere in the process; the single SQL command is probably the best location. I never use outer joins of any kind.Butch
B
0
  1. Well first, I don't know why you are using 12.x. It was EndOfLifed on 31 Dec 2009, after having been notified on 3 Apr 2007. 15.0.2 (first solid version) came out in Jan 2009. 15.5 is much better and was available 02 Dec 2009, so you are two major releases, and over at least 13 months, out of date.

  2. ASE 12.5.4 has the new Join syntax. (you have not specified, you may be on 12.5.0.3, the release prior to that).

  3. DB2 and Sybase did not implement FULL OUTER JOIN, for precisely the reason you have identified: it is covered by LEFT ... UNION ... RIGHT without ALL. It is not a case of "not supporting" a FOJ; it is a case of the keyword is missing.

  4. And then you have the issue that Sybase and DB2 types would generally never use outer joins let alone FOJs, because their databases tend to be more normalised, etc.

  5. Last, there is perfectly ordinary SQL you can use in any version of Sybase that will supply the function of FOJ, and will be distinctly faster on 12.x; only marginally faster on 15.x. It is kind of like the RANK() function: quite unnecessary if you can write a Subquery.

  6. The second reason it does not need FULL OUTER, as some of the low end engines do, is because the new optimiser is extremely fast, and the query is fully normalised. Ie. it performs the LEFT and the RIGHT in a single pass.

  7. Depending on you SARGs and DataType mismatches, etc it may still have to sort-merge, but that too is streamed at all three levels: disk I/O subsystem; engine(s); and network handler. If your tables are partitioned, then it is additionally parallelised at that level.

  8. If your server is not configured and your result set is very large, you may need to increase proc cache size and number of sort buffers. That's all.

Butch answered 19/1, 2011 at 11:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.