I’m new in Teradata and i have a small sql problem, similar to this one below :
source table A:
a|b|c| dt |dt_f ------------------------- 1|1|5|30/01/2020|21/02/2020 1|1|2|28/02/2020|19/03/2020 1|1|2|20/03/2020|17/04/2020 1|1|2|19/04/2020|05/05/2020 1|1|2|30/06/2020|24/07/2020 1|1|2|27/07/2020|31/12/2999
output wanted:
a|b|c| dt |dt_f ------------------------------ 1|1|5|30/01/2020|**27/02/2020** 1|1|2|28/02/2020|**19/05/2020** 1|1|2|30/06/2020|**31/12/2999**
Explanation:
1 –> if c is different (between current row and the next one) so dt_f of the current row = dt of the next row – 1 day , the two rows are selected
2–> if months_between(dt,dt) > 1 ( in the exmple between the row number 4 and the row number 5) so dt of the first row selected with same id a,b and c will be df ( of row number 4) + 1 month and the row 5 will be selected with dt_f = 31/12/2999.
I tried a lot using with recursive but i didn’t get the real result, but i believe it can be solved with it.
thank’s for your replies 🙂
Advertisement
Answer
If this was just about returning overlapping rows, it would be really simple using Teradata’s NORMALIZE extension:
CREATE VOLATILE TABLE vt (a INT, b INT, c INT, dt DATE, dt_f DATE) ON COMMIT PRESERVE ROWS; INSERT INTO vt(1, 1, 5, DATE '2020-01-30', DATE '2020-02-21'); INSERT INTO vt(1, 1, 2, DATE '2020-02-28', DATE '2020-03-19'); INSERT INTO vt(1, 1, 2, DATE '2020-03-20', DATE '2020-04-17'); INSERT INTO vt(1, 1, 2, DATE '2020-04-19', DATE '2020-05-05'); INSERT INTO vt(1, 1, 2, DATE '2020-06-30', DATE '2020-07-24'); INSERT INTO vt(1, 1, 2, DATE '2020-07-27', DATE '2999-12-31'); WITH cte AS ( -- adjusting for gaps > 1 month SELECT NORMALIZE a,b,c ,PERIOD(dt, Add_Months(dt_f,1)) AS pd FROM vt ) SELECT a,b,c ,Begin(pd) AS dt ,Add_Months(End(pd),-1) AS dt_f FROM cte ;
But your logic to adjust the end date needs Analytic Functions. This is probably the simplest query to get those overlapping periods plus additional columns, modified to match your logic:
WITH cte AS ( -- returns both start/end of an island, but in seperate rows SELECT a,b,c ,dt -- start of current island ,Max(dt_f) -- end of previous island (used for finding gaps) Over (PARTITION BY a,b,c ORDER BY dt ROWS BETWEEN Unbounded Preceding AND 1 Preceding) AS prev_max_end ,Lag(dt) -- to adjust end date in case of gap > 1 month Over (PARTITION BY a,b,c ORDER BY dt) AS prev_dt FROM vt QUALIFY Add_Months(prev_max_end,1) < dt -- gap found OR prev_max_end IS NULL -- first row ) SELECT a,b,c ,dt -- start of current island -- next row has end of current island ,CASE WHEN Lead(c ) -- change in c column? Over (PARTITION BY a,b ORDER BY dt) <> c THEN Lead(dt) -- start of next island - 1 Over (PARTITION BY a,b ORDER BY dt) -1 ELSE -- Lead(Add_Months(prev_dt,1),1,DATE '2999-12-31') Over (PARTITION BY a,b ORDER BY dt) END FROM cte ;