I want to consolidate records based on a time gap of 5 minutes between the current row-end date and the next row start date. How can I achieve this T-SQL?
Before Consolidation, the table is like below –
+---------+------------------+------------------+ | EventId | StartTime | EndTime | +---------+------------------+------------------+ | 1 | 2020-11-05 10:05 | 2020-11-05 10:08 | | 2 | 2020-11-05 10:15 | 2020-11-05 10:17 | | 3 | 2020-11-05 10:20 | 2020-11-05 10:24 | | 4 | 2020-11-05 10:25 | 2020-11-05 10:30 | | 5 | 2020-11-05 10:40 | 2020-11-05 10:48 | | 6 | 2020-11-05 10:58 | 2020-11-05 11:05 | +---------+------------------+------------------+
It should look like below after consolidation –
| Consolidated StartTime | Consolidated EndTime | |------------------------|----------------------| | 2020-11-05 10:05 | 2020-11-05 10:08 | | 2020-11-05 10:15 | 2020-11-05 10:30 | | 2020-11-05 10:40 | 2020-11-05 10:48 | | 2020-11-05 10:58 | 2020-11-05 11:05 |
Advertisement
Answer
This is a type of gaps-and-islands problem. For this problem, use lag()
and a cumulative sum:
select min(starttime), max(endtime) from (select t.*, sum(case when prev_endtime >= dateadd(minute, 5, starttime) then 0 else 1 end) over (order by eventid) as island_num from (select t.*, lag(endtime) over (order by eventid) as prev_endtime from t ) t ) t group by island_num;
The lag()
gets the previous endtime
for comparison. The middle query determines when a new “island” starts. This occurs when there is more than 5 minutes between the previous endtime
and current starttime
. The outer query then aggregates to get the final result.