Query furthest children in Adjacency List
Asked Answered
C

3

6

So I have an SQL query to retrieve all the children of a given node in an adjacency list.

WITH    RECURSIVE
        q AS
        (
        SELECT  id, name
        FROM    categories h
        WHERE   id = 11846801
        UNION ALL
        SELECT  hc.id, hc.name
        FROM    q
        JOIN    categories hc
        ON      hc.parent = q.id
        )
SELECT  name
FROM    q

Is there a way to modify this query to return me just the bottom level of nodes? I can't just specify a given level as each path may have a different depth.

Conjoined answered 4/2, 2013 at 22:23 Comment(4)
On second thought, by "bottom" level, so you mean all leave nodes or only leave nodes with the longest path from the start. My answer is for the second interpretation.Schreib
Yea, see my comment below. I was overthinking the problem. When it's simply find all nodes that dont have a parent reference pointing to it. About a million and one ways to solve that.Conjoined
But you probably want the fastest ...Schreib
True, although for my use case a few msec in DB time is a small percentage of runtime.Conjoined
S
3

Interpretation 1

"All leave nodes with the longest path from the start."

One way would be to count the levels on your way down and only return members of the bottom one:

WITH RECURSIVE q AS (
   SELECT  id, name, 0 AS lvl
   FROM    categories
   WHERE   id = 11846801

   UNION ALL
   SELECT  c.id, c.name, q.lvl + 1
   FROM    q
   JOIN    categories c ON c.parent = q.id
   )
SELECT  id, name
FROM    q
WHERE   lvl = (SELECT max(lvl) FROM q);

Interpretation 2

"All leave nodes."

WITH RECURSIVE q AS (
   SELECT  id, name, parent
   FROM    categories
   WHERE   id = 11846801

   UNION ALL
   SELECT  c.id, c.name, c.parent
   FROM    q
   JOIN    categories c ON c.parent = q.id
   )
SELECT  id, name
FROM    q
WHERE   NOT EXISTS (SELECT FROM q q1 WHERE q1.parent = q.id);

It should be faster to check on q than on the base table - unless q is very big, in which case indexes on the main table may be faster.

Schreib answered 4/2, 2013 at 22:42 Comment(0)
G
2

The things at the bottom are never parents. So, you can add the following where clause:

where id not in (select parent from categories)

Actually, in Postgres, not in may not be the most efficient method. So, this might be more efficient:

where not exists (select 1 from categories c where c.parent = q.id)
Gleeson answered 4/2, 2013 at 22:47 Comment(0)
M
1

Just to iterate Gordon Linoff's answer. His query will work is great idea, but only if there are no NULL values. Thus to counter this, an additional where clause is needed.

SELECT id WHERE id NOT IN (SELECT parent FROM categories WHERE parent IS NOT NULL);

I only found out about this today. (and not enough rep to comment)

To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.

Reference http://dev.mysql.com/doc/refman/5.6/en/comparison-operators.html#function_in

Montemayor answered 26/9, 2015 at 11:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.