I have a table x:
id | start_date | end_date
----------------------------
01 | 2016-02-19 | 2017-03-02
02 | 2017-06-19 | 2018-09-11
03 | 2015-03-19 | 2018-05-02
04 | 2018-02-19 | 2018-01-05
05 | 2014-06-19 | 2018-07-25
and I would like to repeat rows based on the time between start_date and end_date, in this case by years extracted from those two date columns. My desired result would resemble:
id | year
=========
01 | 2016
01 | 2017
02 | 2017
02 | 2018
03 | 2015
03 | 2016
03 | 2017
03 | 2018
04 | 2018
05 | 2014
05 | 2015
05 | 2016
05 | 2017
05 | 2018
How can I achieve this in Redshift?
Answer
We can join with a calendar table containing every year that could appear in your table, matching each row to the years between its start year and end year:
WITH years AS (
    SELECT 2014 AS year UNION ALL
    SELECT 2015 UNION ALL
    SELECT 2016 UNION ALL
    SELECT 2017 UNION ALL
    SELECT 2018
)
SELECT
    t2.id,
    t1.year
FROM years t1
INNER JOIN yourTable t2
    ON t1.year BETWEEN DATE_PART('year', t2.start_date) AND DATE_PART('year', t2.end_date)
ORDER BY
    t2.id,
    t1.year;
Note: Use DATE_PART(year, t2.start_date) for Redshift, where the datetime component does not take single quotes.
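If you want to sanity-check the join logic before running it on Redshift, here is a minimal sketch that replays it against an in-memory SQLite database from Python. This is not Redshift: SQLite has no DATE_PART, so the year is extracted with strftime instead, and the years calendar table is built from the data's own minimum and maximum years rather than hard-coded. The function name expand_years is hypothetical and exists only for this illustration; the table name yourTable is the placeholder from the query above.

```python
import sqlite3

# Sample rows copied from the question (id, start_date, end_date).
ROWS = [
    ("01", "2016-02-19", "2017-03-02"),
    ("02", "2017-06-19", "2018-09-11"),
    ("03", "2015-03-19", "2018-05-02"),
    ("04", "2018-02-19", "2018-01-05"),
    ("05", "2014-06-19", "2018-07-25"),
]


def expand_years(rows):
    """Return one (id, year) pair per calendar year spanned by each row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (id TEXT, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

    # Build the calendar table from the earliest start year to the latest
    # end year, so no year list has to be maintained by hand.
    first_year = min(int(start[:4]) for _, start, _ in rows)
    last_year = max(int(end[:4]) for _, _, end in rows)
    conn.execute("CREATE TABLE years (year INTEGER)")
    conn.executemany(
        "INSERT INTO years VALUES (?)",
        [(y,) for y in range(first_year, last_year + 1)],
    )

    # Same range join as the answer; strftime('%Y', ...) stands in for
    # DATE_PART because this runs on SQLite, not Redshift.
    result = conn.execute(
        """
        SELECT t2.id, t1.year
        FROM years t1
        INNER JOIN yourTable t2
            ON t1.year BETWEEN CAST(strftime('%Y', t2.start_date) AS INTEGER)
                           AND CAST(strftime('%Y', t2.end_date) AS INTEGER)
        ORDER BY t2.id, t1.year
        """
    ).fetchall()
    conn.close()
    return result


for row_id, year in expand_years(ROWS):
    print(row_id, year)
```

Only the years between the start year and the end year matter to the join, so row 04, whose end_date falls before its start_date within 2018, still produces a single 2018 row.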