This is inspired by question Retrieve a list of lists in one SQL statement – I have come up with a solution, but I have doubts on its efficiency.
To restate the problem:
- we have 2 Tables:
Person
andParent
Person
contains basic data about each personParent
is a join table relating person with its parents- each Person can have multiple parents
- we want to receive each person data with list of all their ancestors – each ancestor in its own row
- if there are no ancestors, we have only one row for that person, with null parentId
Here is the data format:
Person table
Id <PK> Name
Parent table
Id<PK> ParentPersonId <FK into Person >
Person has rows with values PK
1, 'Jim' 2, 'John' 3, 'Anna' 4, 'Peter'
Parent has rows with values
1, 2 1, 3 2, 3 3, 4
So person 1 has ancestors 2, 3, 4
I want the output in the following form
id name parentPersonId -------------------------- 1 Jim 2 1 Jim 3 1 Jim 4 2 John 3 2 John 4 3 Anna 4 4 Peter (null)
My solution used recursive CTE query, but my fear is that it produces too many rows, as each subtree can be entered multiple times. I needed to filter out duplicates with distinct, and execution plan shows that, even with this simple data, sorting for distinct takes 50% of the time. Here is my query:
WITH cte_org AS ( SELECT per.id, per.name, par.parentPersonId FROM Person per LEFT JOIN Parent par ON per.id = par.id UNION ALL SELECT o.id, o.name, rec.parentPersonId FROM Parent rec INNER JOIN cte_org o ON o.parentPersonId = rec.id WHERE rec.parentPersonId IS NOT NULL ) SELECT DISTINCT * FROM cte_org ORDER BY id, parentPersonId;
http://sqlfiddle.com/#!18/d7d62/4
My questions:
- can I somehow prune the already visited branches, so that recursive-CTE does not produce duplicate rows, and final distinct is not necessary
- is recursive CTE a right approach to this problem?
Advertisement
Answer
On PostgreSQL you can acheive that by replacing UNION ALL
with UNION
.
So the query looks like that:
WITH RECURSIVE cte_org AS ( select per.id, per.name, par.parentPersonId from Person per left join Parent par on per.id = par.id UNION SELECT o.id, o.name, rec.parentPersonId FROM Parent rec INNER JOIN cte_org o ON o.parentPersonId = rec.id where rec.parentPersonId is not null ) SELECT * FROM cte_org ORDER BY id, parentPersonId;