Skip to content
Advertisement

Calculate distinct values per day that resets each month (Big Query)

If I have table like this that contains unique id of a store and the date when they make an order

store_id order_date
A 01-Jun-21
A 02-Jun-21
B 02-Jun-21
C 02-Jun-21
A 03-Jun-21
A 04-Jun-21
D 04-Jun-21
A 01-Jul-21
B 01-Jul-21

I need to aggregate it by day, but each day should only consist of store_id that never appeared before and it resets each month.

The end result should be like this:

order_date count store
01-Jun-21 1
02-Jun-21 2
03-Jun-21 0
04-Jun-21 1
01-Jul-21 2
  • At 1 Jun, only store A make order so it’s 1
  • At 2 Jun, store A, B, C make order, but because store A has make order before, it will be excluded
  • At Jun 3, should be 0, because only order come from store A that has make order before
  • At Jul 1, it resets, so store A and B are counted again

Advertisement

Answer

If you want to count “unique” stores cumulatively within a month, I would recommend using row_number() to determine the first time a store appears:

select order_date, countif(seqnum = 1)
from (select t.*,
             row_number() over (partition by store_id, date_trunc(order_date, month)
                                order by order_date
                               ) as seqnum
      from t
     ) t
group by order_date;
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement