I have a dataset in bigquery which contains order_date: DATE and customer_id.
order_date | CustomerID 2019-01-01 | 111 2019-02-01 | 112 2020-01-01 | 111 2020-02-01 | 113 2021-01-01 | 115 2021-02-01 | 119
I try to count distinct customer_id between the months of the previous year and the same months of the current year. For example, from 2019-01-01 to 2020-01-01, then from 2019-02-01 to 2020-02-01, and then who not bought in the same period of next year 2020-01-01 to 2021-01-01, then 2020-02-01 to 2021-02-01.
The output I am expect
order_date| count distinct CustomerID|who not buy in the next period 2020-01-01| 5191 |250 2020-02-01| 4859 |500 2020-03-01| 3567 |349 ..........| .... |......
and the next periods shouldn’t include the previous.
I tried the code below but it works in another way
with customers as ( select distinct date_trunc(date(order_date),month) as dates, CUSTOMER_WID from t where date(order_date) between '2018-01-01' and current_date()-1 ) select dates, customers_previous, customers_next_period from ( select dates, count(CUSTOMER_WID) as customers_previous, count(case when customer_wid_next is null then 1 end) as customers_next_period, from ( select prev.dates, prev.CUSTOMER_WID, next.dates as next_dates, next.CUSTOMER_WID as customer_wid_next from customers as prev left join customers as next on next.dates=date_add(prev.dates,interval 1 year) and prev.CUSTOMER_WID=next.CUSTOMER_WID ) as t2 group by dates ) order by 1,2
Thanks in advance.
Advertisement
Answer
You can generate the periods using unnest(generate_date_array())
. Then use join
s to bring in the customers from the previous 12 months and the next 12 months. Finally, aggregate and count the customers:
select period, count(distinct c_prev.customer_wid), count(distinct c_next.customer_wid) from unnest(generate_date_array(date '2020-01-01', date '2021-01-01', interval '1 month')) period join customers c_prev on c_prev.order_date <= period and c_prev.order_date > date_add(period, interval -12 month) left join customers c_next on c_next.customer_wid = c_prev.customer_wid and c_next.order_date > period and c_next.order_date <= date_add(period, interval 12 month) group by period;