I have the following table:
Year | Week | Day_1 | Day_2 | Day_3 |
---|---|---|---|---|
2020 | 1 | Walk | Jump | Swim |
2020 | 3 | Walk | Swim | Walk |
2020 | 1 | Jump | Walk | Swim |
I want to group by YEAR, WEEK and Event (Walk, jump, Swim) and count the number of times each event occurs in Day_1, Day_2, Day_3. I.e.
Year | Week | Event | Count_Day_1 | Count_Day_2 | Count_Day_3 |
---|---|---|---|---|---|
2020 | 1 | Walk | 1 | 1 | 0 |
2020 | 3 | Walk | 1 | 0 | 1 |
2020 | 1 | Jump | 1 | 1 | 0 |
2020 | 3 | Jump | 0 | 0 | 0 |
2020 | 1 | Swim | 0 | 0 | 2 |
2020 | 3 | Swim | . 0 | 1 | 0 |
How can I do this efficiently?
Advertisement
Answer
In BigQuery, I would unpivot using arrays and then aggregate:
with t as ( select 2020 as year, 1 as week, 'Walk' as day_1, 'Jump' as day_2, 'Swim' as day_3 union all select 2020, 3, 'Walk', 'Swim', 'Walk' union all select 2020, 1, 'Jump', 'Walk', 'Swim' ) select t.year, t.week, s.event, countif(day = 1) as day_1, countif(day = 2) as day_2, countif(day = 3) as day_3 from t cross join unnest([struct(t.day_1 as event, 1 as day), struct(t.day_2 as event, 2 as day), struct(t.day_3 as event, 3 as day) ]) s group by t.year, t.week, s.event;