Imagine, you are looking at a social network graph of millions of users. Imagine Facebook users who are in different Facebook Groups. Let me give the following example:
We start with Jeff
. Jeff
has a degree of connection with himself of 0
.
Rohit
is in the same Facebook group as Jeff
, so his degree of connection with Jeff
is 1
.
Linda
is in a facebook group with Rohit
, but not in one with Jeff
. Her degree connection with Jeff
is, therefore 2
.
This same phenomenon goes on and on. Now we want to create a query in SQL that can find all users in the user
table that have a degree of connection with Jeff
of 3 or less
. We have the following 2 tables:
user
name | person_id |
---|---|
Jeff | 1 |
Rohit | 2 |
Linda | 3 |
Sid | 4 |
Jin | 5 |
group_in
group_id | person_id |
---|---|
1 | 1 |
1 | 2 |
2 | 2 |
2 | 3 |
3 | 3 |
3 | 4 |
4 | 4 |
4 | 5 |
What query could find and return all the person_id
s and degrees_of_connection
for all users
with a degrees_of_connection <= 3
with Jeff
using just these two tables? The output should show that Jeff
, Rohit
, Linda
, and Sid
are all within 3 degrees of connectivity. Whereas Jin
would not be included in the results as he is 4 degrees of connection away
Advertisement
Answer
You may try the following which uses a recursive cte to find the degree of connection between users. The final projection uses joins to retrieve the respective user names.
WITH user_connections(person_id_1,person_id_2,group_id,no_con) AS ( SELECT p1.person_id, p2.person_id , p2.group_id , case when p2.person_id is null then 0 else 1 end as no_con from group_in p1 left join group_in p2 on p1.group_id = p2.group_id and p1.person_id < p2.person_id UNION ALL SELECT p1.person_id_1, p3.person_id, p3.group_id , 1+no_con from user_connections p1 inner join group_in p2 on ( p2.group_id <> p1.group_id and p2.person_id = p1.person_id_2 and p1.person_id_2 is not null) inner join group_in p3 on p3.group_id = p2.group_id and p3.person_id <> p2.person_id where no_con < 3 ) SELECT con_name.person_id, con_name.name, uc.no_con as degrees_of_connection FROM users u INNER JOIN user_connections uc ON u.person_id = uc.person_id_1 INNER JOIN users con_name on con_name.person_id = uc.person_id_2 WHERE u.name = 'Jeff';
person_id | name | degrees_of_connection |
---|---|---|
2 | Rohit | 1 |
3 | Linda | 2 |
4 | Sid | 3 |
Let me know if this works for you.