I have an events table where I currently have a count of users grouped by the date. I am wanting to make two different counts based on the number of times a user had a specific event (we’ll call it event_a) The first count will count the user if event_a appeared exactly 1 time for the user, the 2nd count will count the user if the event_a appeared more than once for the user.
I believe to achieve this I need a CASE WHEN, but I am unsure how to correctly write the code to give me the output I need. Currently this is what I have:
SELECT calendar_date, COUNT(distinct(users)) AS n_users, COUNT(distinct CASE WHEN eventName = 'event_a' THEN users END) AS n_users_event_a, COUNT(distinct CASE WHEN 'event_a' = 1 THEN userID END) AS bucket_a COUNT(distinct CASE WHEN 'event_a' > 1 THEN userID END) AS bucket_b FROM e_table WHERE calendar_date >= '2019-10-01' AND event_name IN ('event_a', 'event_b', 'event_c') GROUP BY 1 ORDER BY 1;
When I run this code I get the error message:
Query 1 ERROR: ERROR: Invalid input syntax for integer: “event_a”
Looking at my code and the error, it makes sense why I am seeing this error because I am trying to use a mathematical operation on a string. I believe I need to have some type of count within my count syntax, but I am unsure how to approach that. Am I on the right track, or is there a different way to approach this??
Advertisement
Answer
Use two levels of aggregation, one at the date/user level and the other just for the date:
SELECT calendar_date, COUNT(*) AS n_users, COUNT(*) FILTER (WHERE cnt_a > 0) AS n_users_event_a, COUNT(*) FILTER (WHERE cnt_a = 1) AS n_users_event_a_1, COUNT(*) FILTER (WHERE cnt_a > 1) AS n_users_event_a_2pl FROM (SELECT calendar_date, user, eventName, count(*) as cnt, COUNT(*) FILTER (WHERE eventName = 'event_a') as cnt_a FROM e_table WHERE calendar_date >= '2019-10-01' AND event_name IN ('event_a', 'event_b', 'event_c') GROUP BY calendar_date, user ) e GROUP BY 1 ORDER BY 1;
EDIT:
You can replace the FILTER
with:
SELECT calendar_date, COUNT(*) AS n_users, SUM( (cnt_a > 0)::int ) AS n_users_event_a, SUM( (cnt_a = 1)::int ) AS n_users_event_a_1, SUM( (cnt_a > 1)::int ) AS n_users_event_a_2pl