I have two table as below:
TABLE_1:
timestamp id 2020-11-24 01:05:00 UTC AA 2020-11-24 01:07:00 UTC AA 2020-11-24 01:07:00 UTC BB
TABLE_2:
timestamp id covered 2020-11-24 01:04:00 UTC AA true 2020-11-24 01:06:00 UTC AA false
I want to add a column to table_1 to show covered value from table_2. Although timestamps are not matching, I want to get the closest previous timestamp captured in table_2. For example AA was covered at 1:04 and uncovered at 1:06 so result table look like this:
timestamp id covered 2020-11-24 01:05:00 UTC AA true 2020-11-24 01:07:00 UTC AA false 2020-11-24 01:07:00 UTC BB null
Would you please help me to finish this query:
SELECT TABLE_1.timestamp, TABLE_1.id, TABLE_2.covered FROM TABLE_1 LEFT JOIN TABLE_2 ON TABLE_1.id = TABLE_2.id AND ?
Advertisement
Answer
BigQuery is not thrilled with non-equijoins. So an alternative is to use union all
and some trickery:
select t12.* from (select t12.*, last_value(covered ignore nulls) over (partition by id order by timestamp) as imputed_covered from ((select timestamp, id, null as covered from table_1 ) union all (select timestamp, id, covered from table_2 ) ) t12 ) t12 where covered is null;
This is combining the two tables. The last_value()
is then retrieving the most recent covered
value — which for table_1
rows will come from table_2
.
The final where
just filters the result set to only have table_1
rows.