I have a log table like below and want to simplfy it by getting min start date and max end date for consecutive Status values for each Id. I tried many window function combinations but no luck.
This is what I have:
This is what want to see:
Advertisement
Answer
This is a typical gaps-and-islands problem. You want to aggregate groups of consecutive records that have the same Id
and Status
.
No need for recursion, here is one way to solve it using window functions:
select Id, Status, min(StartDate) StartDate, max(EndDate) EndDate from ( select t.*, row_number() over(partition by id order by StartDate) rn1, row_number() over(partition by id, status order by StartDate) rn2 from mytable t ) t group by Id, Status, rn1 - rn2 order by Id, min(StartDate)
The query works by ranking records over two different partitions (by Id
, and by Id
and Status
). The difference between the ranks gives you the group each record belongs to. You can run the subquery independently to see what it returns and understand the logic.
Id | Status | StartDate | EndDate -: | :----- | :------------------ | :------------------ 1 | B | 07/02/2019 00:00:00 | 18/02/2019 00:00:00 1 | C | 18/02/2019 00:00:00 | 10/03/2019 00:00:00 1 | B | 10/03/2019 00:00:00 | 01/04/2019 00:00:00 2 | A | 05/02/2019 00:00:00 | 22/04/2019 00:00:00 2 | D | 22/04/2019 00:00:00 | 05/05/2019 00:00:00 2 | A | 05/05/2019 00:00:00 | 30/06/2019 00:00:00