
Repeat Rows Between Date Values in Redshift

I have a table:

id | start_date | end_date
----------------------------
01 | 2016-02-19 | 2017-03-02
02 | 2017-06-19 | 2018-09-11
03 | 2015-03-19 | 2018-05-02
04 | 2018-02-19 | 2018-01-05
05 | 2014-06-19 | 2018-07-25

I would like to repeat each row once for every year between start_date and end_date (inclusive), using the years extracted from those two date columns. The desired result is:

id | year
---------
01 | 2016
01 | 2017
02 | 2017
02 | 2018
03 | 2015
03 | 2016
03 | 2017
03 | 2018
04 | 2018
05 | 2014
05 | 2015
05 | 2016
05 | 2017
05 | 2018

How can I achieve this in Redshift?

Answer

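You can join your table to a calendar table containing every year that can appear in your data. Each row then matches every year between the year of its start_date and the year of its end_date, inclusive: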

WITH years AS (
    SELECT 2014 AS year UNION ALL
    SELECT 2015 UNION ALL
    SELECT 2016 UNION ALL
    SELECT 2017 UNION ALL
    SELECT 2018
)

SELECT
    t2.id,
    t1.year
FROM years t1
INNER JOIN yourTable t2
    ON t1.year BETWEEN DATE_PART('year', t2.start_date) AND DATE_PART('year', t2.end_date)
ORDER BY
    t2.id,
    t1.year;

Note: On Redshift, write DATE_PART(year, t2.start_date) instead; there the date part is given without single quotes.
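
If you would rather not hard-code the list of years, one option is to derive it from the data. The following is only a sketch, not part of the original answer: it assumes your Redshift cluster supports recursive CTEs (WITH RECURSIVE) and that yourTable is the table from the question. The column names yr and max_yr are made up for illustration.

```sql
-- Sketch: build the list of years from the data instead of hard-coding it.
-- Assumes WITH RECURSIVE is available; yr and max_yr are illustrative names.
WITH RECURSIVE years (yr, max_yr) AS (
    -- Seed row: earliest start year and latest end year found in the table
    SELECT
        MIN(DATE_PART(year, start_date))::INT,
        MAX(DATE_PART(year, end_date))::INT
    FROM yourTable
    UNION ALL
    -- Recursive step: add one year at a time until the latest year is reached
    SELECT yr + 1, max_yr
    FROM years
    WHERE yr < max_yr
)
SELECT
    t2.id,
    t1.yr AS year
FROM years t1
INNER JOIN yourTable t2
    ON t1.yr BETWEEN DATE_PART(year, t2.start_date) AND DATE_PART(year, t2.end_date)
ORDER BY
    t2.id,
    t1.yr;
```

The join condition is the same as in the query above; only the source of the years changes, so the calendar keeps up as new rows arrive.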

User contributions licensed under: CC BY-SA