Oracle CONNECT BY clause after GROUP BY clause
Asked Answered
C

2

9

I have just run across this interesting article here, showing how to simulate wm_concat() or group_concat() in Oracle using a hierarchical query and window functions:

SELECT deptno,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
       KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM   (SELECT deptno,
               ename,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
        FROM   emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1;

Although, I find this not a very readable solution, it's quite interesting, specifically because the CONNECT BY .. STARTS WITH clause comes after the GROUP BY clause. According to the specification, this shouldn't be possible. I've tried this using a simple query and it does work, though! The following two queries return the same results:

-- wrong according to the specification:
select level from dual group by level connect by level <= 2;
-- correct according to the specification:
select level from dual connect by level <= 2 group by level;

Is this an undocumented feature? Or just syntax indifference for convenience? Or do the two statements subtly behave differently?

Creative answered 6/4, 2012 at 12:24 Comment(0)
A
4

I think this is just an insignificant syntax difference.

More specifically, I think this is a documentation bug. The syntax diagram for 8i implies that either order is supported. Nothing in the 8i reference implies the order makes any difference. But that diagram also kind of implies that you can have multiple group_by_clause or hierarchical_query, which isn't true:

--You can't group twice: ORA-01787: only one clause allowed per query block
select level from dual connect by level <= 2 group by level group by level;

My guess is that when Oracle fixed the syntax diagram for 9i they also forgot the order could be different. Or maybe they intentionally left it out, because it seems more logical to do the hierarchical part first.

There are several minor syntax variations like this that are undocumented. I don't think it means that they are unsupported. Oracle probably regrets allowing so many weird options and wants things to at least look simple. For example, HAVING can come before GROUP BY, many of the old parallel features still work (but are ignored), etc. (This is why I always laugh when people say they are going to quickly "parse SQL" - good luck figuring this out!)

Oracle 8i syntax: Oracle 8i SELECT syntax

Oracle 9i syntax: Oracle 9i SELECT syntax

Apostle answered 7/4, 2012 at 16:5 Comment(1)
Thanks for your research. That makes sense, of course. After all, in BNF notation, it can be quite tricky to document order indifference for two clauses, as can be seen in the documentation of CONNECT BY .. START WITH / START WITH .. CONNECT BY, where this becomes quite unreadable.Creative
M
3

Look at the execution plans. In my environment they are identical, with a CONNECT BY operation feeding into a HASH GROUP BY. So it appears that placing the GROUP BY first is just an odd syntax that produces the same result as the more natural ordering.

Technically, this is probably a bug in the parser, since as you say the spec indicates that the hierarchical-query clause should come prior to the group-by clause. But it doesn't appear to make any difference in how the query is executed.

Mellophone answered 6/4, 2012 at 13:12 Comment(1)
Nice hint. I also think this is a bug (or "hidden feature") of the parser. However, this is a trivial query, the execution plan (or even the result) still might differ in a more complex one...Creative

© 2022 - 2024 — McMap. All rights reserved.