I have two tables:
- table A with sales products (test1, test2) with every day information
- table B with history of category changes of these products (product can change category once a year or every day, it depends on products)
And I want to have sales with current categories of products to this day (if category changed today i want to see it to this day and later, before next change)
Please help without many subqueries (tables are heavy)
TABLE A >> uid name date 1 test1 2019-09-01 2 test1 2019-09-02 3 test1 2019-09-03 4 test2 2019-09-01 5 test2 2019-09-02 6 test2 2019-09-03 TABLE B >> uid name date_change category 1 test1 2019-08-30 cat1 2 test1 2019-09-02 cat2 4 test1 2019-09-04 cat3 5 test2 2019-09-01 cat4 6 test2 2019-09-03 cat5 TABLE Result >> uid name date date_change category 1 test1 2019-09-01 2019-08-30 cat1 2 test1 2019-09-02 2019-09-02 cat2 3 test1 2019-09-03 2019-09-02 cat2 4 test2 2019-09-01 2019-09-01 cat4 5 test2 2019-09-02 2019-09-01 cat4 6 test2 2019-09-03 2019-09-03 cat5
Advertisement
Answer
Below is for BigQuery Standard SQL
#standardSQL SELECT a.name, `date`, date_change, category FROM `project.dataset.tableA` a LEFT JOIN( SELECT name, date_change, category, DATE_SUB(LEAD(date_change, 1, DATE '2100-01-01') OVER(PARTITION BY name ORDER BY date_change), INTERVAL 1 DAY) last_day FROM `project.dataset.tableB` ) b ON a.name = b.name AND a.date BETWEEN b.date_change AND b.last_day -- ORDER BY name, date_change
If to apply to sample data from your question – result is
Row name date date_change category 1 test1 2019-09-01 2019-08-30 cat1 2 test1 2019-09-02 2019-09-02 cat2 3 test1 2019-09-03 2019-09-02 cat2 4 test2 2019-09-01 2019-09-01 cat4 5 test2 2019-09-02 2019-09-01 cat4 6 test2 2019-09-03 2019-09-03 cat5