I want to consolidate records based on a time gap of 5 minutes between the current row-end date and the next row start date. How can I achieve this T-SQL?
Before Consolidation, the table is like below –
+---------+------------------+------------------+ | EventId | StartTime | EndTime | +---------+------------------+------------------+ | 1 | 2020-11-05 10:05 | 2020-11-05 10:08 | | 2 | 2020-11-05 10:15 | 2020-11-05 10:17 | | 3 | 2020-11-05 10:20 | 2020-11-05 10:24 | | 4 | 2020-11-05 10:25 | 2020-11-05 10:30 | | 5 | 2020-11-05 10:40 | 2020-11-05 10:48 | | 6 | 2020-11-05 10:58 | 2020-11-05 11:05 | +---------+------------------+------------------+
It should look like below after consolidation –
| Consolidated StartTime | Consolidated EndTime | |------------------------|----------------------| | 2020-11-05 10:05 | 2020-11-05 10:08 | | 2020-11-05 10:15 | 2020-11-05 10:30 | | 2020-11-05 10:40 | 2020-11-05 10:48 | | 2020-11-05 10:58 | 2020-11-05 11:05 |
Advertisement
Answer
This is a type of gaps-and-islands problem. For this problem, use lag() and a cumulative sum:
select min(starttime), max(endtime)
from (select t.*,
sum(case when prev_endtime >= dateadd(minute, 5, starttime) then 0 else 1 end) over
(order by eventid) as island_num
from (select t.*,
lag(endtime) over (order by eventid) as prev_endtime
from t
) t
) t
group by island_num;
The lag() gets the previous endtime for comparison. The middle query determines when a new “island” starts. This occurs when there is more than 5 minutes between the previous endtime and current starttime. The outer query then aggregates to get the final result.