How to SELECT immediate children and ancestors all in the same query
Asked Answered
B

4

6

I'm working with a tree structure in MySQL that is respresented using the nested sets model.

I'm hoping some of you sql experts can help me with building a SELECT query.

I would like to be able to match a set of nodes using LIKE. For each node that is matched, I also need a comma-delimmited list of the ancestors of that node, and a comma-delimmited list of the immediate children of that node.

I'm not really sure where to start with this - if such a thing is even possible in a single query. (Currently I am accomplishing this with a query inside a loop.) What I'm hoping for is a result set that might look something like this....

Starting with the string "qu" and querying the Table "Body" I get...

Node      | Parent Nodes               | Immediate Children
Quads       Leg, Lower Body, Muslces     Vastus Lateralus, Vastus Medialis, Rectus Femoris
Obliques    Core, Trunk, Muscles         Inner obliques, outer obliques

Any suggestions on how to accomplish this without looping queries would be much appreciated.

Bagworm answered 23/2, 2010 at 14:38 Comment(4)
I currently don't have time to write up a complete answer, but if you have access to the book SQL Cookbook, the chapter on heirarchical queries covers using MySQL to solve this problem. For MySQL, you need to know branch-depth to use the queries, but that isn't a problem - you know how deep you want to go. The solution isn't super-trivial, but it is possible.Upstroke
@sheepsimulator - thanks, I'll see if I can track that book down.Bagworm
sheepsimulator, i disagree. it's trivial to solve if you know what depth you need - the holy grail is doing it for arbitrary depth. Or well not holy grail, but ya'know, arbitrary depth is what makes this problem interesting IMOHudson
@ Roland - Concur. But Travis is lucky, he doesn't have to.Upstroke
H
0

While I agree with nickf that this is bad and dirty, it's still fun, so here goes:

SELECT     base.left_id, base.ancestors, 
           GROUP_CONCAT(children.left_id) children
FROM       (
            SELECT     base.left_id
            ,          GROUP_CONCAT(ancestors.left_id) ancestors
            FROM       nested_set   base
            LEFT JOIN  nested_set   ancestors
            ON         base.left_id     BETWEEN ancestors.left_id 
                                            AND ancestors.right_id
            WHERE      base.name  LIKE '%criteria%'
            GROUP BY   base.left_id
           ) base                                    
LEFT JOIN  nested_set   children
ON         children.left_id BETWEEN base.left_id 
                                AND base.right_id
LEFT JOIN  nested_set   inbetween
ON         inbetween.left_id BETWEEN base.left_id 
                                AND base.right_id
AND        children.left_id  BETWEEN inbetween.left_id 
                                AND inbetween.right_id     
WHERE      inbetween.left_id IS NULL
GROUP BY   base.left_id

Basically, the trick is to solve it in two steps: first, solve the ancestors problem, and squash the ancestors to a list with, then, use this result to solve it for the children.

The ancestors part is relatively easy, it is the subquery in the from clause in my solution. The children is a bit harder. It works by taking all descendents, and then requiring that there do not exist any nodes between the base node and the descendents, which basically restricts the descendents to only the children.

There are other variations to this strategy to solve this - for example you can do the children first, and solve the ancestors using a subquery in the SELECT list.

Hudson answered 23/2, 2010 at 14:48 Comment(3)
Thanks for your reply. I will spend some time working with your code snippet above to see if I can get it to work for me. As an aside, you also suggest that this would be a bad thing to actually implement. Is it really a bad idea? I have to assume a solution like this is going to be significantly faster than looping smaller queries. Thanks again.Bagworm
Well, the thing is, the nested set is a real hassle to get right. First up front to maintain the data in this structure, then to query it. What I particularly don't like is that a small change way down left in the tree essentially rewrites the entire tree top and right of that local change: a scalability nightmare. Then I look at typical trees I deal with: they are typically quite small, mostly read, and relatively static. I usually end up with an adjacency list + app code, or a materialized path, or a closure table depending on what I need at the moment.Hudson
I had to change a couple of tiny things to get it to work with my setup, but it works like a charm. Since speed is a big concern for me, I'm going to use this. Thanks again so much.Bagworm
F
0

In one query? I wouldn't bother. The SQL would be horrendous, and probably not even that efficient. Split each bit up into logical smaller queries: first find all the matching nodes, then for each of those, the extra info you need.

Fletcherfletcherism answered 23/2, 2010 at 14:44 Comment(3)
That's what I'm currently doing. I start by selecting the nodes that match using LIKE. Then I loop through them, and query separately for ancestors and children on each iteration. Depending on the size of the initial resultset, that can be a lot of queries...Bagworm
okay I get that, but so what? I still would prefer reading/maintaining several smaller queries which perform a single, well understood function, rather than a single behemoth which is illegible to all but the super gurus.Fletcherfletcherism
Well, I'm no super guru. :) I'm just trying to optimize my SQL. I've always understood that as soon as you start looping your sql queries, disaster is right around the corner.Bagworm
L
0

I'm not 100% sure what you want but if I understand correctly you could achieve this using a normalized database schema and subqueries.

For example:

Table "nodes" Table "node_parents"

The "nodes" table would store all nodes, and "node_parents" would map relationships between different nodes.

So when you select LIKE a certain node you could grab all its parents and children from node_parents.

You could grab the extra info using joins or subqueries.

Loth answered 23/2, 2010 at 14:47 Comment(2)
The problem is that you can only handle one level with that approach.Hudson
on second, thought, yeah this is messy, you wouldn't be able to do it using just 1 query afaik. you would need to at the very least use a stored procedure to loopLoth
H
0

While I agree with nickf that this is bad and dirty, it's still fun, so here goes:

SELECT     base.left_id, base.ancestors, 
           GROUP_CONCAT(children.left_id) children
FROM       (
            SELECT     base.left_id
            ,          GROUP_CONCAT(ancestors.left_id) ancestors
            FROM       nested_set   base
            LEFT JOIN  nested_set   ancestors
            ON         base.left_id     BETWEEN ancestors.left_id 
                                            AND ancestors.right_id
            WHERE      base.name  LIKE '%criteria%'
            GROUP BY   base.left_id
           ) base                                    
LEFT JOIN  nested_set   children
ON         children.left_id BETWEEN base.left_id 
                                AND base.right_id
LEFT JOIN  nested_set   inbetween
ON         inbetween.left_id BETWEEN base.left_id 
                                AND base.right_id
AND        children.left_id  BETWEEN inbetween.left_id 
                                AND inbetween.right_id     
WHERE      inbetween.left_id IS NULL
GROUP BY   base.left_id

Basically, the trick is to solve it in two steps: first, solve the ancestors problem, and squash the ancestors to a list with, then, use this result to solve it for the children.

The ancestors part is relatively easy, it is the subquery in the from clause in my solution. The children is a bit harder. It works by taking all descendents, and then requiring that there do not exist any nodes between the base node and the descendents, which basically restricts the descendents to only the children.

There are other variations to this strategy to solve this - for example you can do the children first, and solve the ancestors using a subquery in the SELECT list.

Hudson answered 23/2, 2010 at 14:48 Comment(3)
Thanks for your reply. I will spend some time working with your code snippet above to see if I can get it to work for me. As an aside, you also suggest that this would be a bad thing to actually implement. Is it really a bad idea? I have to assume a solution like this is going to be significantly faster than looping smaller queries. Thanks again.Bagworm
Well, the thing is, the nested set is a real hassle to get right. First up front to maintain the data in this structure, then to query it. What I particularly don't like is that a small change way down left in the tree essentially rewrites the entire tree top and right of that local change: a scalability nightmare. Then I look at typical trees I deal with: they are typically quite small, mostly read, and relatively static. I usually end up with an adjacency list + app code, or a materialized path, or a closure table depending on what I need at the moment.Hudson
I had to change a couple of tiny things to get it to work with my setup, but it works like a charm. Since speed is a big concern for me, I'm going to use this. Thanks again so much.Bagworm
E
0

This question is a lot harder then I anticipated in your other post, but I disagree with the first posters answer.

I am quite confident that it is possible with a single query.

You need to use SUBQUERIES and selects. Have you ever looked at the really good demo on the mySQL website about Adjacency List Model.

Because you can LIKE for the "Node" you can then use Sub queries for your SQL query to get all of the parents and parent. One of my queries which I did something like this on was absolutely massive! But it worked.

Take a look : http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

This is just a little code snippet which shows how the immediate children is found.

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
    nested_category AS parent,
    nested_category AS sub_parent,
    (
        SELECT node.name, (COUNT(parent.name) - 1) AS depth
        FROM nested_category AS node,
        nested_category AS parent
        WHERE node.lft BETWEEN parent.lft AND parent.rgt
        AND node.name = 'PORTABLE ELECTRONICS'
        GROUP BY node.name
        ORDER BY node.lft
    )AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
    AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
    AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft
Emmi answered 23/2, 2010 at 14:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.