I am currently trying to combine Left join
with window
function (e.g. partition by class_id order by date_to desc
) to solve the following problem in SQL. It is quite a difficult problem due to a bunch of tricky constraints that come into play, and it seems to me I need to have a way to choose which rows from the partitioned sub-tables (after applying windows
function) to go with.
Problem. Given two tables A and B having two common fields class_id
and prop_id
, our goal is that for given parameters year
and month
(e.g. 2022-05), the query would include all values in the column class_id
from Table A, change the corresponding value prop_id
in the same table when certain conditions are satisfied based on the values of date_to
and date_from
columns of Table B. The conditions are: if the same value prop_id
in Table B has date_from
and date_to
between the given parameters year
and month
, then we set the prop_id
in Table A to Null
for that particular class_id
. Note that a specific class_id
in Table B might have multiple prop_id,
so we need to determine the exact pair of date_from
and date_to
that contains the given parameters year
and month
.
Sample Inputs
Table A class_id prop_id 12 aa_13 13 ab_21 22 ac_11 53 bb_32 48 ac_57 Table B class_id prop_id date_from date_to 12 aa_13 2022-02-15 2022-12-10 12 aa_31 2021-09-30 2022-02-12 13 ac_12 2021-05-18 2022-02-05 22 ac_11 2022-05-12 2022-08-25 22 ac_12 2022-01-05 2022-04-23 22 ac_13 2021-08-18 2021-11-16 53 bb_32 2022-02-06 2022-03-19 53 bb_31 2021-05-08 2022-02-05 48 ac_57 2022-02-03 2022-05-07
Now, if given year = 2022
and month = 5
, then the expected output is
Table A class_id prop_id (modified properly) 12 aa_13 13 NULL -- 2022-05 > date_to = 2022-02-05 22 NULL -- 2022-05 could be between 2022-04-23 and 2022-05-11, which is outside the range between 2022-05-12 and 2022-08-25 => NULL is chosen here 53 NULL -- Same reasoning as for class_id = 22 48 NULL - 2022-05 could be beyond 2022-05-07 => set it to NULL.
Question. How can I write a query to achieve the above expected output?
Advertisement
Answer
You didn’t provide a tag for the DB engine, so first is for Postgres, and then Snowflake as per your comment.
Postgres:
create table "public"."a" ( class_id int4, prop_id text ); create table "public"."b" ( class_id int, prop_id varchar, date_from date, date_to date ); insert into public.a(class_id,prop_id) values (12,'aa_13'), (13,'ac_21'), (22,'ac_11'), (53,'bb_32'), (48,'ac_57'); insert into b values (12,'aa_13','2022-02-15','2022-12-10'), (12,'aa_31','2021-09-30','2022-02-12'), (13,'ac_12','2021-05-18','2022-02-05'), (22,'ac_11','2022-05-12','2022-08-25'), (22,'ac_12','2022-01-05','2022-04-23'), (22,'ac_13','2021-08-18','2021-11-16'), (53,'bb_32','2022-02-06','2022-03-19'), (53,'bb_31','2021-05-08','2022-02-05'), (48,'ac_57','2022-02-03','2022-05-07'); select a.class_id, a.prop_id, case when b.date_from <= make_date(2022,5,1) and b.date_to >= make_date(2022,5,1) + '1 month - 1 day'::interval then a.prop_id else null end modified_prop_id from a left join b on a.prop_id = b.prop_id;class_id | prop_id | modified_prop_id -------: | :------ | :--------------- 12 | aa_13 | aa_13 22 | ac_11 | null 13 | ac_21 | null 48 | ac_57 | null 53 | bb_32 | null
db<>fiddle here
Snowflake:
create or replace table a ( class_id number(38,0), prop_id varchar(100) ); create or replace table b ( class_id number(38,0), prop_id varchar(100), date_from date, date_to date ); insert into public.a(class_id,prop_id) values (12,'aa_13'), (13,'ac_21'), (22,'ac_11'), (53,'bb_32'), (48,'ac_57'); insert into b values (12,'aa_13','2022-02-15','2022-12-10'), (12,'aa_31','2021-09-30','2022-02-12'), (13,'ac_12','2021-05-18','2022-02-05'), (22,'ac_11','2022-05-12','2022-08-25'), (22,'ac_12','2022-01-05','2022-04-23'), (22,'ac_13','2021-08-18','2021-11-16'), (53,'bb_32','2022-02-06','2022-03-19'), (53,'bb_31','2021-05-08','2022-02-05'), (48,'ac_57','2022-02-03','2022-05-07'); select a.class_id, a.prop_id, case when b.date_from <= date_from_parts(2022,5,1) and b.date_to >= last_day(date_from_parts(2022,5,1)) then a.prop_id else null end modified_prop_id from a left join b on a.prop_id = b.prop_id;