Imagine, you are looking at a social network graph of millions of users. Imagine Facebook users who are in different Facebook Groups. Let me give the following example:
We start with Jeff. Jeff has a degree of connection with himself of 0.
Rohit is in the same Facebook group as Jeff, so his degree of connection with Jeff is 1.
Linda is in a facebook group with Rohit, but not in one with Jeff. Her degree connection with Jeff is, therefore 2.
This same phenomenon goes on and on. Now we want to create a query in SQL that can find all users in the user table that have a degree of connection with Jeff of 3 or less. We have the following 2 tables:
user
| name | person_id |
|---|---|
| Jeff | 1 |
| Rohit | 2 |
| Linda | 3 |
| Sid | 4 |
| Jin | 5 |
group_in
| group_id | person_id |
|---|---|
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
| 3 | 4 |
| 4 | 4 |
| 4 | 5 |
What query could find and return all the person_ids and degrees_of_connection for all users with a degrees_of_connection <= 3 with Jeff using just these two tables? The output should show that Jeff, Rohit, Linda, and Sid are all within 3 degrees of connectivity. Whereas Jin would not be included in the results as he is 4 degrees of connection away
Advertisement
Answer
You may try the following which uses a recursive cte to find the degree of connection between users. The final projection uses joins to retrieve the respective user names.
WITH user_connections(person_id_1,person_id_2,group_id,no_con) AS (
SELECT
p1.person_id,
p2.person_id ,
p2.group_id ,
case when p2.person_id is null then 0 else 1 end as no_con
from group_in p1
left join group_in p2 on p1.group_id = p2.group_id and p1.person_id < p2.person_id
UNION ALL
SELECT
p1.person_id_1,
p3.person_id,
p3.group_id ,
1+no_con
from user_connections p1
inner join group_in p2 on (
p2.group_id <> p1.group_id and
p2.person_id = p1.person_id_2 and
p1.person_id_2 is not null)
inner join group_in p3 on p3.group_id = p2.group_id and
p3.person_id <> p2.person_id
where no_con < 3
)
SELECT
con_name.person_id,
con_name.name,
uc.no_con as degrees_of_connection
FROM
users u
INNER JOIN user_connections uc ON u.person_id = uc.person_id_1
INNER JOIN users con_name on con_name.person_id = uc.person_id_2
WHERE u.name = 'Jeff';
| person_id | name | degrees_of_connection |
|---|---|---|
| 2 | Rohit | 1 |
| 3 | Linda | 2 |
| 4 | Sid | 3 |
Let me know if this works for you.