How can I create a closure table using data from an adjacency list?
Asked Answered
D

3

13

I have a database containing a hierarchy of categories stored using the adjacency list model.

The hierarchy is 3 levels deep (not including an imaginary root node) and contains approx 1700 nodes. Nodes in the 2nd and 3rd levels can have multiple parents. A additional table is used for the many-to-many relationship as below:

CREATE TABLE dbo.Category(
    id int IDENTITY(1,1) NOT NULL,
    name varchar(255) NOT NULL,
)

CREATE TABLE dbo.CategoryHierarchy(
    relId int IDENTITY(1,1) NOT NULL,
    catId int NOT NULL,
    parentId int NOT NULL,
)

If I move to using the transitive closure table method (for the sake of data integrity etc) is there a relatively easy query I can execute that would generate the values for the closure table? (using SQL Server 2005)

I've look through articles and presentations such as Bill Karwin's Models for hierarchical data but that only has insertion queries for a single node and it would take forever for me to create my tree like that.

Thanks.

EDIT:
RelID in the CategoryHierarchy table is purely for the sake of a primary key, it has no bearing on the node ids of the Category table.

Also by closure table, I mean a table like this:

CREATE TABLE ClosureTable (
    ancestor int NOT NULL,
    descendant int NOT NULL,
    [length] int NOT NULL,
)

Where the first two columns are a compound primary key, and are individually foreign keys to Category.id.

Defant answered 27/9, 2012 at 12:54 Comment(0)
D
7

I think I've been able to work out the solution myself.

If anyone has a better way of doing this, please comment.

IF OBJECT_ID('dbo.ClosureTable', 'U') IS NOT NULL
    DROP TABLE dbo.ClosureTable
GO

CREATE TABLE dbo.ClosureTable (
    ancestor int NOT NULL,
    descendant int NOT NULL,
    distance int NULL
)
GO

DECLARE @depth INT
SET @depth = 1

INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
SELECT catid, catid, 0 FROM dbo.Category -- insert all the self-referencing nodes

WHILE (@depth < 4) -- my tree is only 4 levels deep, i.e 0 - 3
BEGIN
    INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
    SELECT ct.ancestor, h.catid, @depth
    FROM dbo.ClosureTable ct INNER JOIN dbo.CategoryHierarchy h ON ct.descendant = h.parentid
    WHERE ct.distance = @depth - 1

    SET @depth = @depth + 1
END

Cheers :)

Defant answered 27/9, 2012 at 16:38 Comment(0)
S
20

I was trying to figure out the same thing, but wanted it in a recursive CTE. This wouldn't have worked for you (SQL Server 2008+), but here's what I ended up with for anyone else looking.

The key is that the anchors aren't your root nodes (where parent_id IS NULL), but instead all your zero depth rows-to-be in the closure table.

Table

CREATE TABLE dbo.category (
    id         INT IDENTITY(1, 1) NOT NULL,
    parent_id  INT                    NULL
)

Data

INSERT INTO dbo.category (id, parent_id)
VALUES
    (1, NULL),
    (2, 1),
    (3, 1),
    (4, 2)

CTE

WITH category_cte AS
(
    SELECT
        id AS ancestor,
        id AS descendant,
        0  AS depth
    FROM dbo.category

    UNION ALL

    SELECT
        CTE.ancestor  AS ancestor,
        C.id          AS descendant,
        CTE.depth + 1 AS depth
    FROM dbo.category AS C
    JOIN category_cte AS CTE
        ON C.parent_id = CTE.descendant
)
SELECT * FROM category_cte

Result

ancestor descendant depth
-------- ---------- -----
1        1          0     <- anchor query
2        2          0
3        3          0
4        4          0
2        4          1     <- first recursive query
1        2          1
1        3          1
1        4          2     <- second recursive query
Schreibman answered 15/4, 2015 at 0:26 Comment(0)
D
7

I think I've been able to work out the solution myself.

If anyone has a better way of doing this, please comment.

IF OBJECT_ID('dbo.ClosureTable', 'U') IS NOT NULL
    DROP TABLE dbo.ClosureTable
GO

CREATE TABLE dbo.ClosureTable (
    ancestor int NOT NULL,
    descendant int NOT NULL,
    distance int NULL
)
GO

DECLARE @depth INT
SET @depth = 1

INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
SELECT catid, catid, 0 FROM dbo.Category -- insert all the self-referencing nodes

WHILE (@depth < 4) -- my tree is only 4 levels deep, i.e 0 - 3
BEGIN
    INSERT INTO dbo.ClosureTable (ancestor, descendant, distance)
    SELECT ct.ancestor, h.catid, @depth
    FROM dbo.ClosureTable ct INNER JOIN dbo.CategoryHierarchy h ON ct.descendant = h.parentid
    WHERE ct.distance = @depth - 1

    SET @depth = @depth + 1
END

Cheers :)

Defant answered 27/9, 2012 at 16:38 Comment(0)
R
3
  WITH Children (id, name, iteration) AS (
    SELECT id, name, 0
    FROM Category 
    --WHERE id = @folderid -- if you want a startpoint
    UNION ALL 
    SELECT b.id, b.name, a.iteration + 1
    FROM Children AS a, Category AS b, CategoryHierarchy c
    WHERE a.id = c.parentId
      AND b.id = c.catId 
  )
  SELECT id, name, iteration FROM Children

Not tested, but should work like this. At least a start how you do this fast without loops.

EDIT: I missred, not relId, it is parentId that has to link to Childrens table. But the result should be a table with all tables. So this is not what your originally wanted?

Ruelas answered 27/9, 2012 at 13:12 Comment(2)
Thanks @YvesR, however that doesn't really help. The closure table would only have the columns: ancestor, descendant, depth. Also RelId is just an additional column for the sake of a primary key - it has no bearing on the node ids. Sorry for any confusion.Defant
Edited my post. I thought you wanted a table result with all rows and its relations? If you need more informations you can join them after as well.Ruelas

© 2022 - 2024 — McMap. All rights reserved.