Im trying to get a distinct count of user ids logs per day with every week as a partition for the distinct identification. e.g. if one user logs on Friday/Saturday of week 1, and on Monday/Friday of week 2, i want to have a column that counts 1 in Friday and 0 in Saturday week 1, and 1 in Monday and 0 in Friday week 2.
If i have the following data:
I expect this result:
Im working with Presto, and have tried with dense_rank() solutions but with no avail.
Advertisement
Answer
You can use window functions and aggregation:
select date, iso_week, sum(case when rn = 1 then 1 else 0 end) cnt from ( select t.*, row_number() over(partition by userid, iso_week order by date) min_date from mytable t ) t group by date, iso_week
Better yet, using the standard where
clause to aggregate functions, which Presto supports:
select date, iso_week, count(*) filter(where rn = 1) cnt from ( select t.*, row_number() over(partition by userid, iso_week order by date) min_date from mytable t ) t group by date, iso_week