I have two datasets:
dataset_a time_stamp user group value 2021-06-20 12:48:24.521 A video 1 2021-06-15 12:50:24.521 A video 1 2021-06-10 12:48:24.521 A video 1 dataset_b time_stamp user group label 2021-06-20 09:40:24.521 A video BA 2021-06-19 13:30:24.521 A video BB 2021-06-13 12:48:24.521 A video BC 2021-06-09 12:55:24.521 A video BD
I want to create a dataset where if dataset b is within 1 day of timestamp of dataset a by timestamp, user, and group then it is a match. Has anyone done something like this before where it is something like left join on dataset_b.timestamp between dataset_a.timestamp and date_add(dataset_a.timestamp,-1)
. I’d like to have the flexibility where in the future I can test out -7 days as well so it is easily modifiable.
Expected output below:
dataset_a time_stamp user group value timestamp_b label 2021-06-20 12:48:24.521 A video 0.5 2021-06-20 09:40:24.521 BA 2021-06-20 12:48:24.521 A video 0.5 2021-06-19 13:30:24.521 BB 2021-06-15 12:50:24.521 A video 1 NULL NULL 2021-06-10 12:48:24.521 A video 1 2021-06-09 12:55:24.521 BD
Advertisement
Answer
The JOIN
condition does not have to be only equality operator so:
SELECT * FROM dataset_a LEFT JOIN dataset_b ON dataset_b.user = dataset_a.user AND dataset_b.group = dataset_a.group AND dataset_b.time_stamp BETWEEN dataset_a.time_stamp - INTERVAL '1 day' AND dataset_a.time_stamp ;
is a valid join.