Sorting a subtree in a closure table hierarchical-data structure
Asked Answered
I

1

21

I would like to ask you to help me with the problem with sorting of the hierarchical data structure stored as a closure table.

I wanted to use this structure to store my website menu. Everything works fine, but the problem is that I do not know how to sort the exact subtree in a custom order. At the moment the tree is sorted in the order in which the items were added to the database.

My structure is based on Bill Karwin's article about Closure Tables and some other posts.

Here is my MySQL database structure with some DEMO data:

--
-- Table `category`
--

CREATE TABLE IF NOT EXISTS `category` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) COLLATE utf8_czech_ci NOT NULL,
  `active` tinyint(1) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;


INSERT INTO `category` (`id`, `name`, `active`) VALUES
(1, 'Cat 1', 1),
(2, 'Cat 2', 1),
(3, 'Cat  1.1', 1),
(4, 'Cat  1.1.1', 1),
(5, 'Cat 2.1', 1),
(6, 'Cat 1.2', 1),
(7, 'Cat 1.1.2', 1);

--
-- Table `category_closure`
--

CREATE TABLE IF NOT EXISTS `category_closure` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `ancestor` int(11) DEFAULT NULL,
  `descendant` int(11) DEFAULT NULL,
  `depth` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_category_closure_ancestor_category_id` (`ancestor`),
  KEY `fk_category_closure_descendant_category_id` (`descendant`)
) ENGINE=InnoDB;

INSERT INTO `category_closure` (`id`, `ancestor`, `descendant`, `depth`) VALUES
(1, 1, 1, 0),
(2, 2, 2, 0),
(3, 3, 3, 0),
(4, 1, 3, 1),
(5, 4, 4, 0),
(7, 3, 4, 1),
(8, 1, 4, 2),
(10, 6, 6, 0),
(11, 1, 6, 1),
(12, 7, 7, 0),
(13, 3, 7, 1),
(14, 1, 7, 2),
(16, 5, 5, 0),
(17, 2, 5, 1);

Here is my SELECT query for one tree:

SELECT c2.*, cc2.ancestor AS `_parent`
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
WHERE c1.id = __ROOT__ AND c1.active = 1
ORDER BY cc1.depth

For the DEMO instance with __ROOT_ = 1 that query gets:

id  name        active     _parent
1   Cat 1       1          NULL
3   Cat 1.1     1          1
6   Cat 1.2     1          1
4   Cat 1.1.1   1          3
7   Cat 1.1.2   1          3

But what if I for example need to change the order of Cat 1.1 and Cat 1.2 (according to name, or some custom order)?

I have seen some breadcrumbs solution (how to sort by breadcrumbs), but I do not know how to generate and change them.

Ibex answered 28/12, 2012 at 12:36 Comment(2)
+1 thanks for posting the sample DDL and data. – Dmz
Mind blown 🀯 when the legend himself appears in comment. – Boult
D
20

This question comes up frequently not only for Closure Table but also for other methods of storing hierarchical data. It's not easy in any of the designs.

The solution I've come up with for Closure Table involves one additional join. Every node in the tree joins to the chain of its ancestors, like a "breadcrumbs" type query. Then use GROUP_CONCAT() to collapse the breadcrumbs into a comma-separated string, sorting the id numbers by depth in the tree. Now you have a string by which you can sort.

SELECT c2.*, cc2.ancestor AS `_parent`,
  GROUP_CONCAT(breadcrumb.ancestor ORDER BY breadcrumb.depth DESC) AS breadcrumbs
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
JOIN category_closure AS breadcrumb ON (cc1.descendant = breadcrumb.descendant)
WHERE c1.id = 1/*__ROOT__*/ AND c1.active = 1
GROUP BY cc1.descendant
ORDER BY breadcrumbs;

+----+------------+--------+---------+-------------+
| id | name       | active | _parent | breadcrumbs |
+----+------------+--------+---------+-------------+
|  1 | Cat 1      |      1 |    NULL | 1           |
|  3 | Cat  1.1   |      1 |       1 | 1,3         |
|  4 | Cat  1.1.1 |      1 |       3 | 1,3,4       |
|  7 | Cat 1.1.2  |      1 |       3 | 1,3,7       |
|  6 | Cat 1.2    |      1 |       1 | 1,6         |
+----+------------+--------+---------+-------------+

Caveats:

  • The id values should have uniform length, because sorting "1,3" and "1,6" and "1,327" might not give the order you intend. But sorting "001,003" and "001,006" and "001,327" would. So you either need to start your id values at 1000000+, or else use ZEROFILL for ancestor and descendant in the category_closure table.
  • In this solution the display order depends on the numeric order of category id's. That numeric order of id values may not represent the order you want to display the tree. Or you may want the freedom to change the display order irrespective of the numeric id values. Or you may want the same category data to appear in more than one tree, each with different display order.
    If you need more freedom, you need to store the sort-order values separately from the id's, and the solution gets even more complex. But in most projects, it's acceptable to use a short-cut, giving the category id's double-duty as the tree display order.

Re your comment:

Yes, you could store "sibling sort order" as another column in the closure table, then use that value instead of ancestor to build the breadcrumbs string. But if you do that, you end up with a lot of data redundancy. That is, a given ancestor is stored on multiple rows, one for each path descending from it. So you have to store the same value for sibling sort order on all of those rows, which creates the risk of an anomaly.

The alternative would be to create another table, with only one row per distinct ancestor in the tree, and join to that table to get the sibling order.

CREATE TABLE category_closure_order (
  ancestor INT PRIMARY KEY,
  sibling_order SMALLINT UNSIGNED NOT NULL DEFAULT 1
);

SELECT c2.*, cc2.ancestor AS `_parent`,
  GROUP_CONCAT(o.sibling_order ORDER BY breadcrumb.depth DESC) AS breadcrumbs
FROM category AS c1
JOIN category_closure AS cc1 ON (cc1.ancestor = c1.id)
JOIN category AS c2 ON (cc1.descendant = c2.id)
LEFT OUTER JOIN category_closure AS cc2 ON (cc2.descendant = c2.id AND cc2.depth = 1)
JOIN category_closure AS breadcrumb ON (cc1.descendant = breadcrumb.descendant)
JOIN category_closure_order AS o ON breadcrumb.ancestor = o.ancestor
WHERE c1.id = 1/*__ROOT__*/ AND c1.active = 1
GROUP BY cc1.descendant
ORDER BY breadcrumbs;

+----+------------+--------+---------+-------------+
| id | name       | active | _parent | breadcrumbs |
+----+------------+--------+---------+-------------+
|  1 | Cat 1      |      1 |    NULL | 1           |
|  3 | Cat  1.1   |      1 |       1 | 1,1         |
|  4 | Cat  1.1.1 |      1 |       3 | 1,1,1       |
|  7 | Cat 1.1.2  |      1 |       3 | 1,1,2       |
|  6 | Cat 1.2    |      1 |       1 | 1,2         |
+----+------------+--------+---------+-------------+
Dmz answered 28/12, 2012 at 18:53 Comment(9)
Thanks a lot Mr. Karwin, so when I need more freedom on some level of the tree and create a custom order there (to set the order of the menu items which have the same parent), then I can add sth like "same level priority" column or that is a bad idea? – Ibex
Thank you very much Mr. Karwin, I have (hope that successfully) implemented the option with a third table. You are great. – Ibex
Curious, why not store the sort order in the Category table, that way no third table is required? When I've done sorting in the ancestral tree, I've always used a double, so that when a new node needs to be inserted between nodes 1 and 2, I can give it sortValue 1.5 – Revue
@Revue because you might need more than one tree containing some of the same categories in a different order. – Dmz
@BillKarwin Yes, I see. I guess in the work I've done I only ever needed to either have a custom sort, or a sort by data (such as alphabetic on name). I do believe I would prefer to have it in "just" the 2 tables of the closure method, unless of course I knew I would need to support multiple custom sorts. Food for thought anyway, thank you. – Revue
Why not just add a position column in category table? – Booboo
@yeahman, That's the same suggestion TheSmileyCoder made above. – Dmz
You saved my day @BillKarwin ! However, it doesn't work correctly for +9 categories. Because of the alphabetic order of the breadcrumbs, the 10th category will be placed between the 1st and the 2nd one. Would you please have a solution to this issue? – Mouser
@ScoobyDam, use LPAD() to pad the number with zeroes until the numbers have uniform length. See https://mcmap.net/q/235933/-adding-a-leading-zero-to-some-values-in-column-in-mysql – Dmz

© 2022 - 2024 β€” McMap. All rights reserved.