I want to consolidate records based on a time gap of 5 minutes between the current row-end date and the next row start date. How can I achieve this T-SQL?
Before Consolidation, the table is like below –
x
+---------+------------------+------------------+
| EventId | StartTime | EndTime |
+---------+------------------+------------------+
| 1 | 2020-11-05 10:05 | 2020-11-05 10:08 |
| 2 | 2020-11-05 10:15 | 2020-11-05 10:17 |
| 3 | 2020-11-05 10:20 | 2020-11-05 10:24 |
| 4 | 2020-11-05 10:25 | 2020-11-05 10:30 |
| 5 | 2020-11-05 10:40 | 2020-11-05 10:48 |
| 6 | 2020-11-05 10:58 | 2020-11-05 11:05 |
+---------+------------------+------------------+
It should look like below after consolidation –
| Consolidated StartTime | Consolidated EndTime |
|------------------------|----------------------|
| 2020-11-05 10:05 | 2020-11-05 10:08 |
| 2020-11-05 10:15 | 2020-11-05 10:30 |
| 2020-11-05 10:40 | 2020-11-05 10:48 |
| 2020-11-05 10:58 | 2020-11-05 11:05 |
Advertisement
Answer
This is a type of gaps-and-islands problem. For this problem, use lag()
and a cumulative sum:
select min(starttime), max(endtime)
from (select t.*,
sum(case when prev_endtime >= dateadd(minute, 5, starttime) then 0 else 1 end) over
(order by eventid) as island_num
from (select t.*,
lag(endtime) over (order by eventid) as prev_endtime
from t
) t
) t
group by island_num;
The lag()
gets the previous endtime
for comparison. The middle query determines when a new “island” starts. This occurs when there is more than 5 minutes between the previous endtime
and current starttime
. The outer query then aggregates to get the final result.