I have the following schema for the tags table:
CREATE TABLE tags (
id integer NOT NULL,
name character varying(255) NOT NULL,
parent_id integer
);
I need to build a query to return the following structure (here represented as yaml for readability):
- name: Ciencia
parent_id:
id: 7
children:
- name: Química
parent_id: 7
id: 9
children: []
- name: Biología
parent_id: 7
id: 8
children:
- name: Botánica
parent_id: 8
id: 19
children: []
- name: Etología
parent_id: 8
id: 18
children: []
After some trial and error and looking for similar questions in SO, I've came up with this query:
WITH RECURSIVE tagtree AS (
SELECT tags.name, tags.parent_id, tags.id, json '[]' children
FROM tags
WHERE NOT EXISTS (SELECT 1 FROM tags tt WHERE tt.parent_id = tags.id)
UNION ALL
SELECT (tags).name, (tags).parent_id, (tags).id, array_to_json(array_agg(tagtree)) children FROM (
SELECT tags, tagtree
FROM tagtree
JOIN tags ON tagtree.parent_id = tags.id
) v
GROUP BY v.tags
)
SELECT array_to_json(array_agg(tagtree)) json
FROM tagtree
WHERE parent_id IS NULL
But it returns the following results when converted to yaml:
- name: Ciencia
parent_id:
id: 7
children:
- name: Química
parent_id: 7
id: 9
children: []
- name: Ciencia
parent_id:
id: 7
children:
- name: Biología
parent_id: 7
id: 8
children:
- name: Botánica
parent_id: 8
id: 19
children: []
- name: Etología
parent_id: 8
id: 18
children: []
The root node is duplicated. I could merge the results to the expected result in my app code but I feel I am close and it could be done al from PG.
Here's an example with SQL Fiddle: http://sqlfiddle.com/#!15/1846e/1/0
Expected output: https://gist.github.com/maca/e7002eb10f36fcdbc51b
Actual output: https://gist.github.com/maca/78e84fb7c05ff23f07f4