I have a self-referential table User:
id | follower
------------|------------
1 (adam) | 2 (bob)
1 (adam) | 3 (charlie)
2 (bob) | 1 (adam)
2 (bob) | 3 (charlie)
Note that there are circular references.
I want to get all the followers, and followers of the followers, and so on of a user such that all the followers are presented in a flattened list, with their respective depths
For Adam:
id | follower | depth
---|-------------|-------
1 | 1 (bob) | 0
2 | 3 (charlie) | 0
3 | 1 (adam) | 1 (bob -> adam)
4 | 3 (charlie) | 1 (bob -> charlie)
Problem
I want to avoid the rows 3 and 4, which represents two problems:
adam -> bob -> adam
because it’s circular.adam -> bob -> charlie
because charlie has already appeared before.
I’m able to solve problem #1 by using the following query by keeping a path
column of visited id
s in a branch
WITH RECURSIVE cte AS (
SELECT id, follower, 0 as depth, ARRAY[id] AS path
FROM user
UNION ALL
SELECT id, follower, depth + 1, id || path
FROM user
JOIN cte ON user.id = cte.follower
WHERE NOT path @> Array[user.id]
)
SELECT * from cte
But it doesn’t resolve problem #2.
It gives the following result:
follower | depth | path
------------|-------|-----
2 (bob) | 0 | {2}
3 (charlie) | 0 | {3}
3 (charlie) | 1 | {2, 3}
It still has problem #2 (duplicate charlie
entry) because path
column only keeps a list of id
s in a specific branch.
How do I fix problem #2?
Possible solution
I can solve it in my code (Node.JS) by keeping a global cache (path
equivalent).
const list = {}; /* <-- GLOBAL cache */
function recurse(user, depth = 0) {
for(const { id, followers } of user.followers) {
if (!(id in list)) {
list[id] = {id, depth}
recurse({ followers }, depth + 1);
}
}
}
Whereas, as far as I can tell, the above SQL query is the equivalent of something like:
function recursive() {
const list = {}; /* <-- LOCAL cache */
for(const {id} of followers)
if (!(id in list))
How to I replicate my solution in code using a global cache in SQL?
Or any other way I can achieve the desired result?
I’m using Node.JS and PostgreSQL
Advertisement
Answer
If I understand correctly, you want to select only one row per follower after the recursive search:
WITH RECURSIVE cte AS (
SELECT id, follower, 0 as depth, ARRAY[id] AS path
FROM user
UNION ALL
SELECT id, follower, depth + 1, id || path
FROM user
JOIN cte ON user.id = cte.follower
WHERE NOT path @> Array[user.id]
)
SELECT DISTINCT ON (follower) *
FROM cte
ORDER BY follower, depth;