I have data in the following format:
ID | Period | Date |
---|---|---|
1 | 1 | 2020-09-05 |
1 | 2 | 2020-10-01 |
1 | 3 | 2020-10-30 |
2 | 2 | 2020-10-01 |
2 | 4 | 2020-12-05 |
2 | 6 | 2021-02-05 |
3 | 1 | 2020-09-05 |
I want to calculate a streak grouping by both ID and period. The first date instance for each group should start at one, and if the date in the next row is less than 30 days from the first row, +1 to the streak counter. This should reset to 1 if the next row is 30 or more days.
This is my desired output:
ID | Period | Date | Counter |
---|---|---|---|
1 | 1 | 2020-09-05 | 1 |
1 | 2 | 2020-10-01 | 2 |
1 | 3 | 2020-10-30 | 3 |
2 | 2 | 2020-10-01 | 1 |
2 | 4 | 2020-12-05 | 1 |
2 | 6 | 2021-02-05 | 1 |
3 | 1 | 2020-09-05 | 1 |
I’m completely stuck on this – any help would be greatly appreciated. I’m using Microsoft SQL server.
EDIT: Actually using Microsoft SQL Server
Advertisement
Answer
From Oracle 12c, you can use MATCH_RECOGNIZE
to perform a row-by-row comparison:
SELECT id, period, "DATE", counter FROM table_name MATCH_RECOGNIZE( PARTITION BY id ORDER BY "DATE" MEASURES COUNT(*) AS counter ALL ROWS PER MATCH PATTERN (any_date streak_date*) DEFINE streak_date AS "DATE" <= PREV("DATE") + INTERVAL '30' DAY -- AND PREV(period) < period )
Which, for the sample data:
CREATE TABLE table_name (ID, Period, "DATE") AS SELECT 1, 1, DATE '2020-09-05' FROM DUAL UNION ALL SELECT 1, 2, DATE '2020-10-01' FROM DUAL UNION ALL SELECT 1, 3, DATE '2020-10-30' FROM DUAL UNION ALL SELECT 2, 2, DATE '2020-10-01' FROM DUAL UNION ALL SELECT 2, 4, DATE '2020-12-05' FROM DUAL UNION ALL SELECT 2, 6, DATE '2021-02-05' FROM DUAL UNION ALL SELECT 3, 1, DATE '2020-09-05' FROM DUAL
Note: DATE
is a reserved word and it is bad practice to use it as a column name.
Outputs:
ID PERIOD DATE COUNTER 1 1 2020-09-05 00:00:00 1 1 2 2020-10-01 00:00:00 2 1 3 2020-10-30 00:00:00 3 2 2 2020-10-01 00:00:00 1 2 4 2020-12-05 00:00:00 1 2 6 2021-02-05 00:00:00 1 3 1 2020-09-05 00:00:00 1
db<>fiddle here