Skip to content
Advertisement

How to find the degree of connection in a social network using recursion

Imagine, you are looking at a social network graph of millions of users. Imagine Facebook users who are in different Facebook Groups. Let me give the following example:

We start with Jeff. Jeff has a degree of connection with himself of 0.

Rohit is in the same Facebook group as Jeff, so his degree of connection with Jeff is 1.

Linda is in a facebook group with Rohit, but not in one with Jeff. Her degree connection with Jeff is, therefore 2.

This same phenomenon goes on and on. Now we want to create a query in SQL that can find all users in the user table that have a degree of connection with Jeff of 3 or less. We have the following 2 tables:

user

name person_id
Jeff 1
Rohit 2
Linda 3
Sid 4
Jin 5

group_in

group_id person_id
1 1
1 2
2 2
2 3
3 3
3 4
4 4
4 5

What query could find and return all the person_ids and degrees_of_connection for all users with a degrees_of_connection <= 3 with Jeff using just these two tables? The output should show that Jeff, Rohit, Linda, and Sid are all within 3 degrees of connectivity. Whereas Jin would not be included in the results as he is 4 degrees of connection away

Advertisement

Answer

You may try the following which uses a recursive cte to find the degree of connection between users. The final projection uses joins to retrieve the respective user names.

WITH user_connections(person_id_1,person_id_2,group_id,no_con) AS (
    SELECT 
        p1.person_id, 
        p2.person_id , 
        p2.group_id ,
        case when p2.person_id is null then 0 else 1 end as no_con
    from group_in p1
    left join group_in p2 on p1.group_id = p2.group_id and p1.person_id < p2.person_id
    UNION ALL
    SELECT 
         p1.person_id_1, 
         p3.person_id, 
         p3.group_id ,
         1+no_con 
    from user_connections p1
    inner join group_in p2 on (
                              p2.group_id <> p1.group_id and
                              p2.person_id = p1.person_id_2 and
                              p1.person_id_2 is not null) 
    inner join group_in p3 on p3.group_id = p2.group_id and 
                              p3.person_id <> p2.person_id
    where no_con < 3
)
SELECT 
    con_name.person_id,
    con_name.name,
    uc.no_con as degrees_of_connection
FROM 
    users u 
INNER JOIN user_connections uc ON u.person_id = uc.person_id_1
INNER JOIN users con_name on con_name.person_id = uc.person_id_2
WHERE u.name = 'Jeff';
person_id name degrees_of_connection
2 Rohit 1
3 Linda 2
4 Sid 3

Working Demo Fiddle

Let me know if this works for you.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement