I have a table:
id | start_date | end_date
----------------------------
01 | 2016-02-19 | 2017-03-02
02 | 2017-06-19 | 2018-09-11
03 | 2015-03-19 | 2018-05-02
04 | 2018-02-19 | 2018-01-05
05 | 2014-06-19 | 2018-07-25
and I would like to repeat each row once for every year spanned by start_date and end_date, that is, by the years extracted from those two date columns. My desired result would resemble:
id | year
=========
01 | 2016
01 | 2017
02 | 2017
02 | 2018
03 | 2015
03 | 2016
03 | 2017
03 | 2018
04 | 2018
05 | 2014
05 | 2015
05 | 2016
05 | 2017
05 | 2018
How can I achieve this in Redshift?
Answer
We can try joining with a calendar table containing all years which would appear in your table:
WITH years AS (
    SELECT 2014 AS year UNION ALL
    SELECT 2015 UNION ALL
    SELECT 2016 UNION ALL
    SELECT 2017 UNION ALL
    SELECT 2018
)
SELECT
    t2.id,
    t1.year
FROM years t1
INNER JOIN yourTable t2
    ON t1.year BETWEEN DATE_PART('year', t2.start_date)
                   AND DATE_PART('year', t2.end_date)
ORDER BY
    t2.id,
    t1.year;
Note: On Redshift, use DATE_PART(year, t2.start_date) (and likewise for t2.end_date), since the datetime component there does not take single quotes.
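To sanity-check the join logic without a Redshift cluster, here is a minimal sketch that runs the same calendar-table join against an in-memory SQLite database from Python. This is an assumption-laden stand-in rather than Redshift itself: SQLite has no DATE_PART, so strftime('%Y', ...) with a cast to INTEGER is swapped in for the year extraction, and the dates are stored as ISO text. The table name yourTable and the sample rows are taken from the question.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id TEXT, start_date TEXT, end_date TEXT);
    INSERT INTO yourTable VALUES
        ('01', '2016-02-19', '2017-03-02'),
        ('02', '2017-06-19', '2018-09-11'),
        ('03', '2015-03-19', '2018-05-02'),
        ('04', '2018-02-19', '2018-01-05'),
        ('05', '2014-06-19', '2018-07-25');
""")

# Same shape as the answer's query. strftime('%Y', ...) replaces DATE_PART
# because SQLite does not provide DATE_PART (substitution, not Redshift syntax).
query = """
    WITH years AS (
        SELECT 2014 AS year UNION ALL
        SELECT 2015 UNION ALL
        SELECT 2016 UNION ALL
        SELECT 2017 UNION ALL
        SELECT 2018
    )
    SELECT t2.id, t1.year
    FROM years t1
    INNER JOIN yourTable t2
        ON t1.year BETWEEN CAST(strftime('%Y', t2.start_date) AS INTEGER)
                       AND CAST(strftime('%Y', t2.end_date) AS INTEGER)
    ORDER BY t2.id, t1.year
"""

rows = conn.execute(query).fetchall()
for row_id, year in rows:
    print(f"{row_id} | {year}")
```

Each source row matches one calendar row per year in its range, so the join itself does the repeating. Row 04, whose start and end fall in the same year, appears exactly once.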