I answered a recursive CTE yesterday that exposed an issue with the way that these are implemented in SQL Server (possibly in other RDBMS, too?). Basically, when I try to use ROW_NUMBER
against the current recursive level, it runs against each row subset of the current recursive level. I would expect that this would work in true SET logic, and run against the entire current recursive level.
It appears that, from this MSDN article, the issue I have found is intended functionality:
Analytic and aggregate functions in the recursive part of the CTE are applied to the set for the current recursion level and not to the set for the CTE. Functions like ROW_NUMBER operate only on the subset of data passed to them by the current recursion level and not the entire set of data pased to the recursive part of the CTE. For more information, see J. Using analytical functions in a recursive CTE.
In my digging, I could find nowhere that explains why this was chosen to work the way it does? This is more of a procedural approach in a set based language, so this works against my SQL thought process and is quite confusing in my opinion. Does anybody know and/or can anybody explain why the recursive CTE treats analytic functions at the recursion level in a procedural fashion?
Here is the code to help visualize this:
Notice, the RowNumber
column in each one of these code outputs.
Here is the SQLFiddle for the CTE (only showing the 2nd level of the recursion)
WITH myCTE
AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
FROM tblGroups
WHERE ParentId IS NULL
UNION ALL
SELECT tblGroups.*,
ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber,
RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
JOIN myCTE
ON myCTE.GroupID = tblGroups.ParentID
)
SELECT *
FROM myCTE
WHERE RecurseLevel = 2;
WITH myCTE
AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
FROM tblGroups
WHERE ParentId IS NULL
)
SELECT tblGroups.*,
ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber,
RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
JOIN myCTE
ON myCTE.GroupID = tblGroups.ParentID;
I always envisioned the SQL recursive CTE to run more like this while loop
DECLARE @RecursionLevel INT
SET @RecursionLevel = 0
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, @RecursionLevel AS recurseLevel
INTO #RecursiveTable
FROM tblGroups
WHERE ParentId IS NULL
WHILE EXISTS( SELECT tblGroups.* FROM tblGroups JOIN #RecursiveTable ON #RecursiveTable.GroupID = tblGroups.ParentID WHERE recurseLevel = @RecursionLevel)
BEGIN
INSERT INTO #RecursiveTable
SELECT tblGroups.*,
ROW_NUMBER() OVER (ORDER BY #RecursiveTable.RowNumber , tblGroups.Score desc) AS RowNumber,
recurseLevel + 1 AS recurseLevel
FROM tblGroups
JOIN #RecursiveTable
ON #RecursiveTable.GroupID = tblGroups.ParentID
WHERE recurseLevel = @RecursionLevel
SET @RecursionLevel = @RecursionLevel + 1
END
SELECT * FROM #RecursiveTable ORDER BY RecurseLevel;
EXCEPT
as per this question – HalflightEXCEPT
will join that list. A Connect Item for ROW_NUMBER indicates that they did do this here too in 2008 but reversed it for some use case to do withhierarchyid
s – Halflight#RecursiveTable
that more closely mirrors the logical description. – Halflight