Skip to content
Advertisement

Google BigQuery SQL: How to fill in gaps in a table with dates?

Data Table:

Dates Customer Subscription
20/02/2020 A free
21/02/2020 A free
22/02/2020 A free
23/02/2020 B free
23/03/2020 A full
01/03/2020 B full
01/03/2020 A full
02/03/2020 A full

Need to fill gaps in dates by the value in the previous date

Output:

Dates Customer Last Subscription
20/02/2020 A free
21/02/2020 A free
22/02/2020 A free
23/03/2020 A full
23/03/2020 B free
24/02/2020 A full
24/02/2020 B free
25/02/2020 A full
25/02/2020 B free
26/02/2020 A full
26/02/2020 B free
27/02/2020 A full
27/02/2020 B free
28/02/2020 A full
28/02/2020 B free
01/03/2020 A full
01/03/2020 B full
02/03/2020 A full
02/03/2020 B full

I found a similar solution Duplicate groups of records to fill multiple date gaps in Google BigQuery, but it is not suitable because in my example each Customer has a different start date.

Advertisement

Answer

Consider below

with temp as (
  select customer, dates from (    
    select customer, min(dates) min_date, max(dates) max_date
    from `project.dataset.table`
    group by customer
  ), unnest(generate_date_array(min_date, max_date)) dates
)
select customer, dates, 
  first_value(subscription ignore nulls) over win as subscription
from temp a
left join `project.dataset.table` b
using(customer, dates)
window win as (partition by customer order by dates desc rows between current row and unbounded following)
# order by dates, customer          

If to apply to sample data in y our question – output is

enter image description here

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement