I have this table named votes
, this table might consist of duplicate records with these columns business_id
and user_id
. I created a view that will display the count of votes per business but unfortunately the duplicates also has been counted which should not be. Below is my view code
( SELECT `v`.`business_id` AS `business_id`, COUNT(`v`.`vote_id`) AS `num_votes` FROM `connectn_top100`.`votes` `v` WHERE (`v`.`year` = 2019) GROUP BY `v`.`business_id` )
The actual data would be
------------------------------------ id | business_id | user_id | vote_id ------------------------------------ 1 | 12 | 12 | 1 ------------------------------------ 2 | 12 | 12 | 1 ------------------------------------ 3 | 12 | 12 | 1 ------------------------------------ 4 | 13 | 15 | 1 ------------------------------------ 5 | 13 | 15 | 1 ------------------------------------ 6 | 12 | 16 | 1 ------------------------------------ 7 | 23 | 16 | 1 ------------------------------------
so the result should be
------------------------------------ id | business_id | user_id | vote_id ------------------------------------ 1 | 12 | 12 | 1 ------------------------------------ 4 | 13 | 15 | 1 ------------------------------------ 6 | 12 | 16 | 1 ------------------------------------ 7 | 23 | 16 | 1 ------------------------------------
so it should be user_id must have unique business_id and business_id must have unique user_id
tried adding another GROUP BY but seems an error.
Advertisement
Answer
If you want to count each user_id
just once per business_id
, you can use count(distinct ...)
:
SELECT `v`.`business_id` AS `business_id`, COUNT(DISTINCT `v`.`user_id`) AS `num_votes` FROM `connectn_top100`.`votes` `v` WHERE `v`.`year` = 2019 GROUP BY `v`.`business_id`
Edit: from your sample data, it looks like you want:
SELECT `v`.`business_id` AS `business_id`, `v`.`user_id`, COUNT(DISTINCT `v`.`vote_id`) AS `num_votes` FROM `connectn_top100`.`votes` `v` WHERE `v`.`year` = 2019 GROUP BY `v`.`business_id`, `v`.`user_id`