There is a source table which loads the data full and monthly. The table looks like below example.
Source table:
pk | code_paym | code_terms | etl_id |
---|---|---|---|
1 | 2 | 3 | 2020-08-01 |
1 | 2 | 3 | 2020-09-01 |
1 | 2 | 4 | 2020-10-01 |
1 | 2 | 4 | 2020-11-01 |
1 | 2 | 4 | 2020-12-01 |
1 | 2 | 4 | 2021-01-01 |
1 | 2 | 3 | 2021-02-01 |
1 | 2 | 3 | 2021-03-01 |
1 | 2 | 3 | 2021-04-01 |
1 | 2 | 3 | 2021-05-01 |
I would like to create valid_from valid_to columns from the source table like below example.
Desired Output:
pk | code_paym | code_terms | valid_from | valid_to |
---|---|---|---|---|
1 | 2 | 3 | 2020-08-01 | 2020-09-01 |
1 | 2 | 4 | 2020-10-01 | 2021-01-01 |
1 | 2 | 3 | 2021-02-01 | 2021-05-01 |
As it can be seen attributes can go back to the same values by the time. How can I make this output happen by sql code?
Thank you very much, Regards
Advertisement
Answer
Using CONDITIONAL_TRUE_EVENT windowed function to determine continuous subgroups:
CREATE OR REPLACE TABLE t( pk INT, code_paym INT, code_terms INT, etl_id DATE) AS SELECT 1, 2, 3, '2020-08-01' UNION ALL SELECT 1, 2, 3, '2020-09-01' UNION ALL SELECT 1, 2, 4, '2020-10-01' UNION ALL SELECT 1, 2, 4, '2020-11-01' UNION ALL SELECT 1, 2, 4, '2020-12-01' UNION ALL SELECT 1, 2, 4, '2021-01-01' UNION ALL SELECT 1, 2, 3, '2021-02-01' UNION ALL SELECT 1, 2, 3, '2021-03-01' UNION ALL SELECT 1, 2, 3, '2021-04-01' UNION ALL SELECT 1, 2, 3, '2021-05-01';
Query:
WITH cte AS ( SELECT t.*, CONDITIONAL_TRUE_EVENT(CODE_TERMS != LAG(CODE_TERMS,1,CODE_TERMS) OVER(PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID)) OVER(PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID) AS grp FROM t ) SELECT PK, CODE_PAYM, grp, MIN(ETL_ID) AS valid_from, MAX(ETL_ID) AS valid_to FROM cte GROUP BY PK, CODE_PAYM, grp;
Output: