Skip to content
Advertisement

Modifying records in one column of a table conditioning on the date ranges in the other table

I am currently trying to combine Left join with window function (e.g. partition by class_id order by date_to desc) to solve the following problem in SQL. It is quite a difficult problem due to a bunch of tricky constraints that come into play, and it seems to me I need to have a way to choose which rows from the partitioned sub-tables (after applying windows function) to go with.

Problem. Given two tables A and B having two common fields class_id and prop_id, our goal is that for given parameters year and month (e.g. 2022-05), the query would include all values in the column class_id from Table A, change the corresponding value prop_id in the same table when certain conditions are satisfied based on the values of date_to and date_from columns of Table B. The conditions are: if the same value prop_id in Table B has date_from and date_to between the given parameters year and month, then we set the prop_id in Table A to Null for that particular class_id. Note that a specific class_id in Table B might have multiple prop_id, so we need to determine the exact pair of date_from and date_to that contains the given parameters year and month.

Sample Inputs

Table A  
         class_id   prop_id   
         12         aa_13         
         13         ab_21
         22         ac_11
         53         bb_32
         48         ac_57

Table B
         class_id   prop_id  date_from     date_to 
         12         aa_13    2022-02-15    2022-12-10
         12         aa_31    2021-09-30    2022-02-12  
         13         ac_12    2021-05-18    2022-02-05
         22         ac_11    2022-05-12    2022-08-25     
         22         ac_12    2022-01-05    2022-04-23
         22         ac_13    2021-08-18    2021-11-16
         53         bb_32    2022-02-06    2022-03-19
         53         bb_31    2021-05-08    2022-02-05
         48         ac_57    2022-02-03    2022-05-07

Now, if given year = 2022 and month = 5, then the expected output is

Table A  
         class_id   prop_id (modified properly)   
         12         aa_13         
         13         NULL -- 2022-05 > date_to = 2022-02-05
         22         NULL -- 2022-05 could be between 2022-04-23 and 2022-05-11, which is outside the range between 2022-05-12 and 2022-08-25 => NULL is chosen here
         53         NULL -- Same reasoning as for class_id = 22
         48         NULL - 2022-05 could be beyond 2022-05-07 => set it to NULL.

Question. How can I write a query to achieve the above expected output?

Advertisement

Answer

You didn’t provide a tag for the DB engine, so first is for Postgres, and then Snowflake as per your comment.

Postgres:

create table "public"."a" (
  class_id int4, 
  prop_id text
);

create table "public"."b" (
  class_id int, 
  prop_id varchar,
  date_from date,
  date_to date
);

insert into public.a(class_id,prop_id) values
(12,'aa_13'),
(13,'ac_21'),
(22,'ac_11'),
(53,'bb_32'),
(48,'ac_57');

insert into b values
(12,'aa_13','2022-02-15','2022-12-10'),
(12,'aa_31','2021-09-30','2022-02-12'),
(13,'ac_12','2021-05-18','2022-02-05'),
(22,'ac_11','2022-05-12','2022-08-25'),
(22,'ac_12','2022-01-05','2022-04-23'),
(22,'ac_13','2021-08-18','2021-11-16'),
(53,'bb_32','2022-02-06','2022-03-19'),
(53,'bb_31','2021-05-08','2022-02-05'),
(48,'ac_57','2022-02-03','2022-05-07');

select
  a.class_id, 
  a.prop_id,
  case 
      when b.date_from <= make_date(2022,5,1) and b.date_to >= make_date(2022,5,1) + '1 month - 1 day'::interval
      then a.prop_id 
      else null 
  end modified_prop_id
from a left join b on a.prop_id = b.prop_id;
class_id | prop_id | modified_prop_id
-------: | :------ | :---------------
      12 | aa_13   | aa_13           
      22 | ac_11   | null            
      13 | ac_21   | null            
      48 | ac_57   | null            
      53 | bb_32   | null            

db<>fiddle here

Snowflake:

create or replace table a (
    class_id number(38,0),
    prop_id varchar(100)
);

create or replace table b (
    class_id number(38,0),
    prop_id varchar(100),
    date_from date,
    date_to date
);

insert into public.a(class_id,prop_id) values
(12,'aa_13'),
(13,'ac_21'),
(22,'ac_11'),
(53,'bb_32'),
(48,'ac_57');

insert into b values
(12,'aa_13','2022-02-15','2022-12-10'),
(12,'aa_31','2021-09-30','2022-02-12'),
(13,'ac_12','2021-05-18','2022-02-05'),
(22,'ac_11','2022-05-12','2022-08-25'),
(22,'ac_12','2022-01-05','2022-04-23'),
(22,'ac_13','2021-08-18','2021-11-16'),
(53,'bb_32','2022-02-06','2022-03-19'),
(53,'bb_31','2021-05-08','2022-02-05'),
(48,'ac_57','2022-02-03','2022-05-07');
      
select
    a.class_id, 
    a.prop_id,
    case 
        when b.date_from <= date_from_parts(2022,5,1) and b.date_to >= last_day(date_from_parts(2022,5,1))
        then a.prop_id 
        else null 
    end modified_prop_id
from a left join b on a.prop_id = b.prop_id;
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement