recursive query for adjacency list to preorder tree traversal in SQL?
Asked Answered
F

1

7

I am migrating data from one database schema to another. The old schema has a categorization system based on an adjacency list, with id, category, and parent_id. If one category is under a second, that category has the second's id as its parent id. For example:

+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+

The new schema has a modified preorder tree traversal algorithm:

+-------------+----------------------+-----+-----+
| category_id | name                 | lft | rgt |
+-------------+----------------------+-----+-----+
|           1 | ELECTRONICS          |   1 |  20 |
|           2 | TELEVISIONS          |   2 |   9 |
|           3 | TUBE                 |   3 |   4 |
|           4 | LCD                  |   5 |   6 |
|           5 | PLASMA               |   7 |   8 |
|           6 | PORTABLE ELECTRONICS |  10 |  19 |
|           7 | MP3 PLAYERS          |  11 |  14 |
|           8 | FLASH                |  12 |  13 |
|           9 | CD PLAYERS           |  15 |  16 |
|          10 | 2 WAY RADIOS         |  17 |  18 |
+-------------+----------------------+-----+-----+

Examples taken from the article Managing Hierarchical Data in MySQL.

Anyhow, I'm capable or writing a php script with a recursive function that will migrate the adjacency list to the preorder tree structure. Basically for each row, it inserts it with a blank 'rgt' value, looks for children, applies the function recursively to them, keeping track of the counter, and then updates the 'rgt' value.

But I want to do this in pure SQL. However, I don't know enough to get a foothold on it. For starters, I don't know if you can do this with a recursive query, or if there are other ways of doing it.

Flare answered 1/4, 2011 at 3:57 Comment(3)
Do you have a known number of hierarchical levels - is this a static data set, or does it need to work on 'any' data set?Mixologist
In the data set I'm working with, it's more or less static, so there are known levels.Flare
But I really am looking for a general purpose algorithm.Flare
F
6

Here's what I've got; it's an example and can be generalized or adapted to your situation.

First I'll set up an adjacency list (language family data from wikipedia):

CREATE TABLE IF NOT EXISTS `language_family_adj_list` (
  `language_id` int(11) NOT NULL auto_increment,
  `language` varchar(20) NOT NULL,
  `parent_id` int(11) default NULL,
  PRIMARY KEY  (`language_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=41 ;


INSERT INTO `language_family_adj_list` (`language_id`, `language`, `parent_id`) VALUES
(1, 'Finno-Ugric', NULL),
(2, 'Hungarian', 1),
(3, 'Khanty', 1),
(4, 'Mansi', 1),
(5, 'Permic', 1),
(6, 'Mari', 1),
(7, 'Mordvinic', 1),
(8, 'Sami', 1),
(9, 'Baltic-Finnic', 1),
(10, 'Komi', 5),
(11, 'Komi-Permyak', 5),
(12, 'Udmurt', 5),
(13, 'Erzya', 7),
(14, 'Moksha', 7),
(15, 'Western Sami', 8),
(16, 'Eastern Sami', 8),
(17, 'Southern Sami', 15),
(18, 'Umi Sami', 15),
(19, 'Lule Sami', 15),
(20, 'Pite Sami', 15),
(22, 'Northern Sami', 15),
(23, 'Kemi Sami', 16),
(24, 'Inari Sami', 16),
(25, 'Akkala Sami', 16),
(26, 'Kildin Sami', 16),
(27, 'Skolt Sami', 16),
(28, 'Ter Sami', 16),
(29, 'Estonian', 9),
(30, 'Finnish', 9),
(31, 'Ingrian', 9),
(32, 'Karelian', 9),
(33, 'Livonian', 9),
(34, 'Veps', 9),
(35, 'Votic', 9),
(36, 'South Estonian', 29),
(37, 'Voro', 36),
(38, 'Karelian Proper', 32),
(39, 'Lude', 32),
(40, 'Olonets Karelian', 32);

Here's a query to demonstrate:

mysql> SELECT t1.language AS lev1, t2.language as lev2, t3.language as lev3, t4.language as lev4, t5.language AS lev5
    -> FROM language_family_adj_list AS t1
    -> LEFT JOIN language_family_adj_list AS t2 ON t2.parent_id = t1.language_id
    -> LEFT JOIN language_family_adj_list AS t3 ON t3.parent_id = t2.language_id
    -> LEFT JOIN language_family_adj_list AS t4 ON t4.parent_id = t3.language_id
    -> LEFT JOIN language_family_adj_list AS t5 ON t5.parent_id = t4.language_id
    -> WHERE t1.parent_id IS NULL
    -> ORDER BY t1.language, t2.language, t3.language, t4.language, t5.language;
+-------------+---------------+--------------+------------------+------+
| lev1        | lev2          | lev3         | lev4             | lev5 |
+-------------+---------------+--------------+------------------+------+
| Finno-Ugric | Baltic-Finnic | Estonian     | South Estonian   | Voro | 
| Finno-Ugric | Baltic-Finnic | Finnish      | NULL             | NULL | 
| Finno-Ugric | Baltic-Finnic | Ingrian      | NULL             | NULL | 
| Finno-Ugric | Baltic-Finnic | Karelian     | Karelian Proper  | NULL | 
| Finno-Ugric | Baltic-Finnic | Karelian     | Lude             | NULL | 
| Finno-Ugric | Baltic-Finnic | Karelian     | Olonets Karelian | NULL | 
| Finno-Ugric | Baltic-Finnic | Livonian     | NULL             | NULL | 
| Finno-Ugric | Baltic-Finnic | Veps         | NULL             | NULL | 
| Finno-Ugric | Baltic-Finnic | Votic        | NULL             | NULL | 
| Finno-Ugric | Hungarian     | NULL         | NULL             | NULL | 
| Finno-Ugric | Khanty        | NULL         | NULL             | NULL | 
| Finno-Ugric | Mansi         | NULL         | NULL             | NULL | 
| Finno-Ugric | Mari          | NULL         | NULL             | NULL | 
| Finno-Ugric | Mordvinic     | Erzya        | NULL             | NULL | 
| Finno-Ugric | Mordvinic     | Moksha       | NULL             | NULL | 
| Finno-Ugric | Permic        | Komi         | NULL             | NULL | 
| Finno-Ugric | Permic        | Komi-Permyak | NULL             | NULL | 
| Finno-Ugric | Permic        | Udmurt       | NULL             | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Akkala Sami      | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Inari Sami       | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Kemi Sami        | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Kildin Sami      | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Skolt Sami       | NULL | 
| Finno-Ugric | Sami          | Eastern Sami | Ter Sami         | NULL | 
| Finno-Ugric | Sami          | Western Sami | Lule Sami        | NULL | 
| Finno-Ugric | Sami          | Western Sami | Northern Sami    | NULL | 
| Finno-Ugric | Sami          | Western Sami | Pite Sami        | NULL | 
| Finno-Ugric | Sami          | Western Sami | Southern Sami    | NULL | 
| Finno-Ugric | Sami          | Western Sami | Umi Sami         | NULL | 
+-------------+---------------+--------------+------------------+------+
29 rows in set (0.00 sec)

So here's the modified preorder traversal tree table schema:

CREATE TABLE language_family_mptt (
 language VARCHAR(30) NOT NULL,
 lft INT NOT NULL,
 rgt INT NOT NULL
) COLLATE utf8;

And then here's the recursive stored procedure to migrate the data:

TRUNCATE TABLE language_family_mptt;
SET max_sp_recursion_depth = 255;
DROP PROCEDURE IF EXISTS insert_branches;
DROP PROCEDURE IF EXISTS start_tree;
DELIMITER ~~

CREATE PROCEDURE start_tree()
BEGIN
    DECLARE language_field VARCHAR(100);
    DECLARE done INT DEFAULT 0;
    DECLARE insert_id INT;
    DECLARE source_id INT;

    DECLARE cursor1 CURSOR FOR SELECT language, language_id FROM language_family_adj_list WHERE parent_id IS NULL ORDER BY language;

    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cursor1;
    read_loop: LOOP

        SET @my_left = 1;

        FETCH cursor1 INTO language_field, source_id;
        INSERT INTO language_family_mptt ( language, lft ) VALUES ( language_field, 1 );

        CALL insert_branches( source_id );

        UPDATE language_family_mptt SET rgt = @my_left + 1 WHERE lft = 1 AND rgt = 0;

        IF done THEN
            LEAVE read_loop;
        END IF;

    END LOOP;
    CLOSE cursor1;

END; ~~

CREATE PROCEDURE insert_branches( IN source_parent_id INT )
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE next_source_parent_id INT DEFAULT NULL;
    DECLARE orig_left INT DEFAULT NULL;
    DECLARE language_field VARCHAR(100);    
    DECLARE cursor1 CURSOR FOR SELECT language_id, language FROM language_family_adj_list WHERE parent_id = source_parent_id ORDER BY language;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cursor1;

    read_loop: LOOP

        FETCH cursor1 INTO next_source_parent_id, language_field;

        IF done THEN
            LEAVE read_loop;
        END IF;

        SET @my_left = @my_left + 1;

        INSERT INTO language_family_mptt ( language, lft ) VALUES ( language_field, @my_left ); 

        SET orig_left = @my_left;

        CALL insert_branches( next_source_parent_id );

        UPDATE language_family_mptt SET rgt = @my_left + 1 WHERE lft = orig_left AND rgt = 0 ;

        SET @my_left = @my_left + 1;

    END LOOP;
    CLOSE cursor1;
END; ~~

DELIMITER ;

And here's the results:

mysql> SELECT CONCAT( REPEAT( ' ', (COUNT(parent.language) - 1) ), node.language) AS name FROM language_family_mptt AS node, language_family_mptt AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt GROUP BY node.language ORDER BY node.lft;
+---------------------------------+
| name                            |
+---------------------------------+
|    Finno-Ugric                  | 
|        Baltic-Finnic            | 
|            Estonian             | 
|                South Estonian   | 
|                    Voro         | 
|            Finnish              | 
|            Ingrian              | 
|            Karelian             | 
|                Karelian Proper  | 
|                Lude             | 
|                Olonets Karelian | 
|            Livonian             | 
|            Veps                 | 
|            Votic                | 
|        Hungarian                | 
|        Khanty                   | 
|        Mansi                    | 
|        Mari                     | 
|        Mordvinic                | 
|            Erzya                | 
|            Moksha               | 
|        Permic                   | 
|            Komi                 | 
|            Komi-Permyak         | 
|            Udmurt               | 
|        Sami                     | 
|            Eastern Sami         | 
|                Akkala Sami      | 
|                Inari Sami       | 
|                Kemi Sami        | 
|                Kildin Sami      | 
|                Skolt Sami       | 
|                Ter Sami         | 
|            Western Sami         | 
|                Lule Sami        | 
|                Northern Sami    | 
|                Pite Sami        | 
|                Southern Sami    | 
|                Umi Sami         | 
+---------------------------------+
39 rows in set (0.00 sec)

:D

Flare answered 6/4, 2011 at 3:17 Comment(3)
Recursive stored procedure...very nice.Hokum
I tried this but "rgt" column didn't get populated, lft values are ok but rgt is null for all rowsMcclurg
I just tried this by copy/pasting on mysql 5.1.63-0+squeeze1 on debian, and it worked okay. Did your original adjacency list table get populated correctly?Flare

© 2022 - 2024 — McMap. All rights reserved.