Skip to content
Advertisement

Count from 4 tables with join and null values

Tables:

 Posts           Comments           Likes            Users
ID | User_id     ID | Post_id      ID | Post_id     ID | Name
-------          --------          --------         --------
1  | 101          1 | 1             1 | 3          101 | 'Michael'
2  | 101          2 | 3             2 | 3          202 | 'Daniel'
3  | 303          3 | 3             3 | 3          303 | 'Scott'

What I am expecting to query is this:

ID | Total_comments | Total_likes | Name
1  |       1        |     0       | 'Michael'
2  |       0        |     0       | 'Michael'
3  |       2        |     3       | 'Scott'

However this is what I get:

ID | Total_comments | Total_likes | Name
1  |       1        |     0       | 'Michael'
2  |       0        |     0       | 'Michael'
3  |       6        |     6       | 'Scott'

With this query:

SELECT Posts.ID,
       COUNT(Comments.Post_id) as Total_comments, 
       COUNT(Likes.Post_id) as Total_likes,
       Users.Name
       FROM Posts INNER JOIN Users 
       ON Users.ID = Posts.User_id 
       LEFT JOIN Comments 
       ON Comments.Post_id = Posts.ID 
       LEFT JOIN Likes 
       ON Likes.Post_id = Posts.ID 
       GROUP BY Posts.ID

I wonder, in what way the query above could be modified to return the expected results? What would be a good approach to achieve this?

Advertisement

Answer

You are joining along two dimensions, so you are getting a Cartesian products. That is simply what happens.

A hacky solution — which works quite well if there are not too many comments and likes for a given post — is to use COUNT(DISTINCT):

SELECT Posts.ID,
       COUNT(DISTINCT Comments.Post_id) as Total_comments, 
       COUNT(DISTINCT Likes.Post_id) as Total_likes,
       Users.Name

The most efficient solution, though, is probably correlated subqueries:

SELECT p.ID,
       (SELECT COUNT(*) FROM Comments c WHERE c.Post_Id = p.ID) as Total_comments, 
       (SELECT COUNT(*) FROM Likes l WHERE l..Post_id = p.ID) as Total_likes,
       u.Name
FROM Posts p INNER JOIN
     Users u
     ON u.ID = p.User_id ;

This is faster because it avoids the outer aggregation. However, it requires indexes on comments(post_id) and likes(post_id).

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement