Skip to content
Advertisement

Select and skip rows by date range with checking difference between them

I’m new in Teradata and i have a small sql problem, similar to this one below :

source table A:

a|b|c|  dt      |dt_f
-------------------------
1|1|5|30/01/2020|21/02/2020
1|1|2|28/02/2020|19/03/2020
1|1|2|20/03/2020|17/04/2020
1|1|2|19/04/2020|05/05/2020
1|1|2|30/06/2020|24/07/2020
1|1|2|27/07/2020|31/12/2999

output wanted:

a|b|c|    dt    |dt_f
------------------------------
1|1|5|30/01/2020|**27/02/2020**
1|1|2|28/02/2020|**19/05/2020**
1|1|2|30/06/2020|**31/12/2999**

Explanation:

1 –> if c is different (between current row and the next one) so dt_f of the current row = dt of the next row – 1 day , the two rows are selected

2–> if months_between(dt,dt) > 1 ( in the exmple between the row number 4 and the row number 5) so dt of the first row selected with same id a,b and c will be df ( of row number 4) + 1 month and the row 5 will be selected with dt_f = 31/12/2999.

I tried a lot using with recursive but i didn’t get the real result, but i believe it can be solved with it.

thank’s for your replies 🙂

Answer

If this was just about returning overlapping rows, it would be really simple using Teradata’s NORMALIZE extension:

CREATE VOLATILE TABLE vt 
(a INT, b INT, c INT, dt DATE, dt_f DATE)
ON COMMIT PRESERVE ROWS;

INSERT INTO vt(1, 1, 5, DATE '2020-01-30', DATE '2020-02-21');
INSERT INTO vt(1, 1, 2, DATE '2020-02-28', DATE '2020-03-19');
INSERT INTO vt(1, 1, 2, DATE '2020-03-20', DATE '2020-04-17');
INSERT INTO vt(1, 1, 2, DATE '2020-04-19', DATE '2020-05-05');
INSERT INTO vt(1, 1, 2, DATE '2020-06-30', DATE '2020-07-24');
INSERT INTO vt(1, 1, 2, DATE '2020-07-27', DATE '2999-12-31'); 

WITH cte AS 
 ( -- adjusting for gaps > 1 month
   SELECT NORMALIZE a,b,c
     ,PERIOD(dt, Add_Months(dt_f,1)) AS pd
   FROM vt
 )
SELECT a,b,c
  ,Begin(pd) AS dt
  ,Add_Months(End(pd),-1) AS dt_f
FROM cte
;

But your logic to adjust the end date needs Analytic Functions. This is probably the simplest query to get those overlapping periods plus additional columns, modified to match your logic:

WITH cte AS
 ( -- returns both start/end of an island, but in seperate rows
   SELECT 
      a,b,c
     ,dt        -- start of current island
     ,Max(dt_f) -- end of previous island (used for finding gaps)
      Over (PARTITION BY a,b,c
            ORDER BY dt
            ROWS BETWEEN Unbounded Preceding
                     AND 1 Preceding) AS prev_max_end
     ,Lag(dt)   -- to adjust end date in case of gap > 1 month
      Over (PARTITION BY a,b,c
            ORDER BY dt) AS prev_dt
   FROM vt
   QUALIFY Add_Months(prev_max_end,1) < dt -- gap found
        OR prev_max_end IS NULL            -- first row
 )
SELECT
   a,b,c
  ,dt -- start of current island

   -- next row has end of current island
  ,CASE
     WHEN Lead(c ) -- change in c column?
          Over (PARTITION BY a,b
                ORDER BY dt) <> c 
     THEN Lead(dt) -- start of next island - 1
          Over (PARTITION BY a,b
                ORDER BY dt) -1
     ELSE -- 
          Lead(Add_Months(prev_dt,1),1,DATE '2999-12-31')
          Over (PARTITION BY a,b
                ORDER BY dt)
   END
FROM cte
;