Why do Recursive CTEs run analytic functions (ROW_NUMBER) procedurally?
Asked Answered
C

1

12

I answered a recursive CTE yesterday that exposed an issue with the way that these are implemented in SQL Server (possibly in other RDBMS, too?). Basically, when I try to use ROW_NUMBER against the current recursive level, it runs against each row subset of the current recursive level. I would expect that this would work in true SET logic, and run against the entire current recursive level.

It appears that, from this MSDN article, the issue I have found is intended functionality:

Analytic and aggregate functions in the recursive part of the CTE are applied to the set for the current recursion level and not to the set for the CTE. Functions like ROW_NUMBER operate only on the subset of data passed to them by the current recursion level and not the entire set of data pased to the recursive part of the CTE. For more information, see J. Using analytical functions in a recursive CTE.

In my digging, I could find nowhere that explains why this was chosen to work the way it does? This is more of a procedural approach in a set based language, so this works against my SQL thought process and is quite confusing in my opinion. Does anybody know and/or can anybody explain why the recursive CTE treats analytic functions at the recursion level in a procedural fashion?


Here is the code to help visualize this:

Notice, the RowNumber column in each one of these code outputs.

Here is the SQLFiddle for the CTE (only showing the 2nd level of the recursion)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE
WHERE RecurseLevel = 2;

Here is the second SQLFiddle for what I would expect the CTE to do (again only need the 2nd level to display the issue)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

I always envisioned the SQL recursive CTE to run more like this while loop

DECLARE @RecursionLevel INT
SET @RecursionLevel = 0
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, @RecursionLevel AS recurseLevel
INTO #RecursiveTable
FROM tblGroups
WHERE ParentId IS NULL

WHILE EXISTS( SELECT tblGroups.* FROM tblGroups JOIN #RecursiveTable ON #RecursiveTable.GroupID = tblGroups.ParentID WHERE recurseLevel = @RecursionLevel)
BEGIN

    INSERT INTO #RecursiveTable
    SELECT tblGroups.*, 
        ROW_NUMBER() OVER (ORDER BY #RecursiveTable.RowNumber , tblGroups.Score desc) AS RowNumber, 
        recurseLevel + 1 AS recurseLevel
    FROM tblGroups
        JOIN #RecursiveTable
            ON #RecursiveTable.GroupID = tblGroups.ParentID
    WHERE recurseLevel = @RecursionLevel
    SET @RecursionLevel = @RecursionLevel + 1
END

SELECT * FROM #RecursiveTable ORDER BY RecurseLevel;
Crick answered 1/4, 2012 at 14:53 Comment(7)
All recursive CTEs currently get the same basic plan where it adds the rows to a spool that acts as a stack and then processes that row by row using nested loops. Similar issue with EXCEPT as per this questionHalflight
@MartinSmith Yes, I understand that, but my question is why does it do that, when it could easily treat this is a set based recursion. That is SQL's strength, not this procedural methodology.Crick
Don't know. Guess simpler or more efficient implementation? Most functions that expose difference between logical and physical implementation are banned. EXCEPT will join that list. A Connect Item for ROW_NUMBER indicates that they did do this here too in 2008 but reversed it for some use case to do with hierarchyidsHalflight
@MartinSmith Hrmmm, I am not on the same page as Paul White (as in I still disagree), but I do understand where they are coming from with the reasoning. If you post your comment as an answer, I will accept as that is what I was looking for...MS reasoningCrick
You might want to look at these blog posts: explainextended.com/2009/11/18/… and this one: explainextended.com/2009/11/23/recursive-ctes-postgresqlMuslim
@Justin I haven't really answered your question though as to why it is implemented like that with a stack spool rather than a work table like #RecursiveTable that more closely mirrors the logical description.Halflight
This seems to be an implementation detail in SQL Server. For PostgreSQL the first statement returns what you seem to expect: sqlfiddle.com/#!1/4c6ec/1 and Oracle seems to work the same way as SQL Server: sqlfiddle.com/#!4/4c6ec/13Muslim
O
1

Analytic functions are special in the way that they need a known resultset to resolve. They depend on the following, preceding or full resultset to caculate current value. That said, merging view is never allowed on a view that contains an analytic function. Why? That will change the result.

Ex:

    Select * from (
      select row_number() over (partition by c1 order by c2) rw, c3 from t) z
    where c3=123

is not the same as

    select row_number() over (partition by c1 order by c2) rw, c3 from t 
    where c3=123

These 2 will return different values for rw. That's why sub-queries containing analytic functions will always be fully resolved before and never be merged with the rest.

Update

Looking at the 2nd query:

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

It works exactly as if it was written like (Same execution plan and result) :

SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
JOIN (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
    FROM tblGroups
    WHERE ParentId IS NULL
    )myCTE ON myCTE.GroupID = tblGroups.ParentID;

This one needs to be partitioned to reset the rownumber.

Recursive queries don't work in a while loop, they are not procedural. At the base, they work like a recursive function, but depending on the tables, the query, the indexes, they can be optimized to run one way or the other.

If we do follow the concept that view cannot be merged when using analytic functions, and looking at query 1. It can only run once way, and it's in nested loop.

WITH myCTE
AS
( /*Cannot be merged*/
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel,
  cast(0 as bigint) n
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

/*Cannot be merged*/
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber, tblGroups.Score desc) AS RowNumber,       RecurseLevel + 1 AS RecurseLevel,
  myCTE.RowNumber
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE;

So 1st select, cannot be merged 2nd, neither. The only way to run this query is in nested loop for each item returned in each level, hence the reset. Again, it's not a question of procedural or not, just a question of possible execution plan.

Hope this answers you question, let me if it doesn't:)

y

Obsidian answered 14/4, 2012 at 14:43 Comment(6)
Thanks, but I already understand that about analytic functions. The question is not about how analytic functions work, but how they function within the CTE. They are functioning in a more procedural fashion, rather than the typical SQL set logic. Notice my first query is pretty much the same thing as my second one (I am just picking out the results from the second iteration). Please re-read my entire question and hopefully it will make more sense as to what I am expecting.Crick
Sorry, I'm new to stackoverflow. Here is what I can see, 2nd query is not a CTE. It's just a regular join. The 1st is a CTE. Rownumber (analytic functions) work the same in CTE like any other query. And they do not work in a "while loop" fashion. CTE is a self-joined joined recursive sql (At least in Oracle:-)). Not sure if this answers your question, if it doesn't, please let me know, this got me very curious!Obsidian
The 2nd is a CTE, just not a recursive one. That was my mistake in my comment for not stating that clearer (it should have read ...but how they function within the recursive CTE). What I am saying is that a recursive CTE is essentially a while loop. The 1st query is a fully recursive CTE where I'm only outputting the 1st iteration (after the base set)...but notice the RowNumber resetting. The 2nd is just a regular CTE where I am using the output to run the same query I have in my first (essentially mimicking what should happen in the 1st iteration)..notice RowNumber does NOT resetCrick
You're right CTE but not recursive. The 2nd one has no reason to reset, You're only asking row_number ordered by CTE rownumber. The query returns 4 rows, abcd ordered by CTE (1,2)&score. If yo want a reset, it should be (partition by myCTE.RowNumber order by tblGroups.Score desc). The 1st query is resetting everytime you change level/parent.Obsidian
No, I do not want a reset, and the 1st query is resetting every time I change parent/subitem. As far as I see it, query 1 and query 2 should be the same for the first iteration. I dont know how else to explain it. Please make sure to look at each example and its output carefully. Hopefully, you will see what I am talking about.Crick
I think I see your point. It's more efficient: For each row in the top (outer) input, scan the bottom (inner) input and output matching rows than store each level's output in a temp & run against that b/c you'd have to store, run next, store, run next... VS smashing the CTE into one giant SQL statement? Kind of creating a branching JOIN (Each UNION ALL results in N joins based on N results from the top output?) Although, it would have to evaluate each along the way to find out when it was done...but I guess it wouldnt have to store temp, just check output. Is that what you were getting at?Crick

© 2022 - 2024 — McMap. All rights reserved.