Skip to content
Advertisement

Find users with most number of common pages they liked

I am trying to find pairs of users who liked the same pages and list the ones who have the most common page likes at the top.

For simplicity I am considering the following table schema

Likes (LikeID,  UserID)

LikeDetail (LikeDetailID,   LikeID, PageID)

I am trying to find pairs of users with most number of common page likes ordered descending. E.g User1 and User2 have liked 3 pages in common.

I would to have the resulting set of the query to be

UserID1 UserID2 NoOfCommonLikes
2       3       10
4       3       8
1       5       4

I am guessing it would need aggregation, join and aliases however I needed to rename a table twice using AS which did not work for me.

Any tip would be appreciated in MySQL, or SQL Server.

Advertisement

Answer

In SQL Server and MySQL 8+, you can use a CTE which JOINs the Likes and LikeDetail table, and then self-JOIN that where PageID is the same but UserID is not, and then grouping on the two userID values:

WITH CTE AS 
(SELECT l.UserId, d.PageID
 FROM Likes l
 JOIN LikeDetail d ON d.LikeID = l.likeID)
SELECT l1.UserId AS UserID1, l2.UserID AS UserID2, COUNT(*) AS NoOfCommonLikes
FROM CTE l1
JOIN CTE l2 ON  l2.PageID = l1.PageID AND l2.UserID < l1.UserID
GROUP BY l1.UserID, l2.UserID
ORDER BY COUNT(*) DESC

In versions of MySQL prior to 8.0, you need to repeat the CTE defintion twice in a JOIN to achieve the same result:

SELECT l1.UserId AS UserID1, l2.UserID AS UserID2, COUNT(*) AS NoOfCommonLikes
FROM (SELECT l.UserId, d.PageID
      FROM Likes l
      JOIN LikeDetail d ON d.LikeID = l.likeID) l1
JOIN (SELECT l.UserId, d.PageID
      FROM Likes l
      JOIN LikeDetail d ON d.LikeID = l.likeID) l2 ON l2.PageID = l1.PageID AND l2.UserID < l1.UserID
GROUP BY l1.UserID, l2.UserID
ORDER BY COUNT(*) DESC

Note that we use < in the UserID comparison rather than != to avoid getting duplicate rows (e.g. for (UserID1, UserID2) = (1, 2) and (UserID1, UserID2) = (2, 1).

I’ve made a small demo on dbfiddle which demonstrate the queries.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement