
Rolling 90-day active users in BigQuery, improving performance (DAU/MAU/WAU)

I’m trying to get the number of unique events on a specific date, rolling 90/30/7 days back. I’ve got this working on a limited number of rows with the query below, but for large data sets I get memory errors because the aggregated string becomes massive.

I’m looking for a more efficient way of achieving the same result.

Table looks something like this:

Format of the desired result:

My query looks like this:


Answer

Counting unique users requires a lot of resources, even more if you want results over a rolling window. For a scalable solution, look into approximate algorithms like HLL++:
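A minimal sketch of the HLL++ approach, assuming a hypothetical table `my_dataset.events` with an `event_date` DATE column and a `user_id` STRING column (placeholder names, not taken from the question). It builds one sketch per day with `HLL_COUNT.INIT`, fans each day out to the 90 report dates whose window contains it, and merges the sketches per report date with `HLL_COUNT.MERGE`:

```sql
#standardSQL
-- Hypothetical table: my_dataset.events(event_date DATE, user_id STRING)
WITH daily_sketches AS (
  -- One HLL++ sketch per day, summarizing that day's distinct users
  SELECT event_date, HLL_COUNT.INIT(user_id) AS sketch
  FROM `my_dataset.events`
  GROUP BY event_date
)
SELECT
  -- A day d falls inside the 90-day window of report dates d .. d+89
  DATE_ADD(event_date, INTERVAL i DAY) AS report_date,
  HLL_COUNT.MERGE(sketch) AS approx_unique_90_day_users,
  -- MERGE ignores NULLs, so only the most recent 30 / 7 days contribute
  HLL_COUNT.MERGE(IF(i < 30, sketch, NULL)) AS approx_unique_30_day_users,
  HLL_COUNT.MERGE(IF(i < 7, sketch, NULL)) AS approx_unique_7_day_users
FROM daily_sketches, UNNEST(GENERATE_ARRAY(0, 89)) AS i
GROUP BY report_date
ORDER BY report_date
```

The fan-out is applied to one small sketch per day rather than to one row per user, so the intermediate data stays small. Note that the earliest report dates cover fewer than 90 days of data, and report dates run up to 89 days past the last day in the table.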

For an exact count, this would work (but gets slower as the window gets larger):
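As a rough sketch of an exact version, assuming a hypothetical table `my_dataset.events` with `event_date` (DATE) and `user_id` (STRING) columns (placeholder names, not taken from the question), each (day, user) pair can be repeated for every report date whose window contains it and then counted with `COUNT(DISTINCT ...)`:

```sql
#standardSQL
-- Hypothetical table: my_dataset.events(event_date DATE, user_id STRING)
WITH daily_users AS (
  -- Deduplicate to one row per (day, user) before the 90x fan-out
  SELECT DISTINCT event_date, user_id
  FROM `my_dataset.events`
)
SELECT
  -- A day d falls inside the 90-day window of report dates d .. d+89
  DATE_ADD(event_date, INTERVAL i DAY) AS report_date,
  COUNT(DISTINCT user_id) AS unique_90_day_users,
  -- COUNT(DISTINCT) skips NULLs, so only recent days are counted here
  COUNT(DISTINCT IF(i < 30, user_id, NULL)) AS unique_30_day_users,
  COUNT(DISTINCT IF(i < 7, user_id, NULL)) AS unique_7_day_users
FROM daily_users, UNNEST(GENERATE_ARRAY(0, 89)) AS i
GROUP BY report_date
ORDER BY report_date
```

Every (day, user) row is multiplied by the window length before aggregation, which is why the cost grows with the size of the window.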


The approximate solution produces results much faster (14s vs 366s), at the cost of the counts being approximate rather than exact.


Updated query that gives correct results by removing rows whose window covers fewer than 90 days of data (works when no dates are missing):
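One way this filter could look, as a sketch only: a `HAVING` clause keeps a report date only when its window contains 90 daily rows. The table `my_dataset.events` and its `event_date` and `user_id` columns are hypothetical placeholders, and only the 90-day column is shown to keep the example short:

```sql
#standardSQL
-- Hypothetical table: my_dataset.events(event_date DATE, user_id STRING)
WITH daily_sketches AS (
  -- Exactly one row per day that has data
  SELECT event_date, HLL_COUNT.INIT(user_id) AS sketch
  FROM `my_dataset.events`
  GROUP BY event_date
)
SELECT
  DATE_ADD(event_date, INTERVAL i DAY) AS report_date,
  HLL_COUNT.MERGE(sketch) AS approx_unique_90_day_users
FROM daily_sketches, UNNEST(GENERATE_ARRAY(0, 89)) AS i
GROUP BY report_date
-- Each daily row lands on a given report_date at most once, so COUNT(*)
-- is the number of days with data inside that report date's window
HAVING COUNT(*) = 90
ORDER BY report_date
```

If a day has no events at all, every window containing that day has fewer than 90 rows and is dropped, which is the "no dates are missing" caveat.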

User contributions licensed under: CC BY-SA