I have a dataset in bigquery which contains order_date: DATE and customer_id.
order_date | CustomerID 2020-01-01 | 111 2020-02-01 | 112 2020-03-01 | 111 2021-01-01 | 113 2021-02-01 | 115 2021-03-01 | 119
How can I count distinct customer_id between the months of the previous year and the same months of the current year? For example, from 2020-01-01 to 2021-01-01, then from 2020-02-01 to 2021-01-01, and so on until the current date and should be grouped by the latest date. The output looks like
order_date| count distinct CustomerID 2021-01-01| 5191 2021-02-01| 4859 2021-03-01| 3567 ..........| ....
and the next periods shouldn’t include the previous.
Thanks in advance.
Advertisement
Answer
If you want just a count for each month you can expand the data and aggregate:
select mon, count(distinct customerid) from t cross join unnest(generate_date_array(t.order_date, date_add(t.order_date, interval 11 month), interval 1 month)) mon group by mon order by mon;