Is there a way to change this BigQuery self-join to use a window function?

Let’s say I have a BigQuery table “events” (in reality this is a slow sub-query) that stores the count of events per day, by event type. There are many types of events and most of them don’t occur on most days, so there is only a row for day/event type combinations with a non-zero count.

I have a query that returns the count for each event type and day and the count for that event from N days ago, which looks like this:

The query is slow. BigQuery best practices recommend using window functions instead of self-joins. Is there a way to do this here? I could use the LAG function if there was a row for each day, but there isn’t. Can I “pad” it somehow? (There isn’t a short list of possible event types. I could of course join to SELECT DISTINCT type FROM events, but that probably won’t be faster than the self-join.)

Answer

Below is for BigQuery Standard SQL
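
One way to avoid both the self-join and the padding is a window with a RANGE frame ordered by a numeric day number, so the frame picks the row exactly N days back (if one exists) no matter how many rows are missing in between. In BigQuery Standard SQL that would be roughly `ORDER BY UNIX_DATE(day) RANGE BETWEEN N PRECEDING AND N PRECEDING` inside a `PARTITION BY type` window. Below is a minimal runnable sketch of that idea, not necessarily the exact query intended here. It uses Python's built-in `sqlite3` as a stand-in for BigQuery (it needs SQLite 3.28 or newer for RANGE offsets). The column names `day` and `cnt`, the value of `N`, the sample rows, and the choice to show 0 when there is no row N days back are all assumptions made for illustration.

```python
import sqlite3

N = 3  # look-back in days (hypothetical value)

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (day, type) with a non-zero count.
conn.execute("CREATE TABLE events (day TEXT, type TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2020-01-01", "a", 5),
        ("2020-01-04", "a", 7),
        ("2020-01-05", "a", 2),
        ("2020-01-02", "b", 1),
        ("2020-01-05", "b", 4),
    ],
)

# strftime('%s', day) / 86400 is the SQLite counterpart of BigQuery's
# UNIX_DATE(day): an integer number of days since 1970-01-01.
# The frame "N PRECEDING AND N PRECEDING" holds only the row whose day
# number is exactly N less than the current one; SUM over an empty
# frame is NULL, which COALESCE turns into 0 (an assumed convention).
query = f"""
SELECT day, type, cnt,
       COALESCE(SUM(cnt) OVER (
           PARTITION BY type
           ORDER BY CAST(strftime('%s', day) AS INTEGER) / 86400
           RANGE BETWEEN {N} PRECEDING AND {N} PRECEDING
       ), 0) AS cnt_n_days_ago
FROM events
ORDER BY type, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because there is at most one row per day and type, the frame contains zero or one row, so `SUM` (or `FIRST_VALUE`, or `MAX`) simply returns that row's count.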

If applied to the sample data from your question, the result is:

User contributions licensed under: CC BY-SA