Skip to content
Advertisement

T-SQL filtering records based on dates and time difference with other records

I have a table for which I have to perform a rather complex filter: first a filter by date is applied, but then records from the previous and next days should be included if their time difference does not exceed 8 hours compared to its prev or next record (depending if the date is less or greater than filter date).

For those adjacent days the selection should stop at the first record that does not satisfy this condition.

This is how my raw data looks like:

Id Desc EntryDate
1 Event type 1 2021-03-12 21:55:00.000
2 Event type 1 2021-03-12 01:10:00.000
3 Event type 1 2021-03-11 20:17:00.000
4 Event type 1 2021-03-11 05:04:00.000
5 Event type 1 2021-03-10 23:58:00.000
6 Event type 1 2021-03-10 11:01:00.000
7 Event type 1 2021-03-10 10:00:00.000

In this example set, if my filter date is ‘2021-03-11’, my expected result set should be all records from that day plus adjacent records from 03-12 and 03-10 that satisfy the 8 hours condition. Note how record with Id 7 is not be included because record with Id 6 does not comply:

Id EntryDate
2 2021-03-12 01:10:00.000
3 2021-03-11 20:17:00.000
4 2021-03-11 05:04:00.000
5 2021-03-10 23:58:00.000

Need advice how to write this complex query

Advertisement

Answer

This is a variant of gaps-and-islands. Define the difference . . . and then groups based on the differences:

with e as (
      select t.*
      from (select t.*,
                   sum(case when prev_entrydate > dateadd(hour, -8, entrydate) then 0 else 1 end) over (order by entrydate) as grp
            from (select t.*,
                         lag(entrydate) over (order by entrydate) as prev_entrydate
                  from t
                 ) t
     )
select e.*
from e.*
where e.grp in (select e2.grp
                from t e2
                where date(e2.entrydate) = @filterdate
               );

Note: I’m not sure exactly how filter date is applied. This assumes that it is any events on the entire day, which means that there might be multiple groups. If there is only one group (say the first group on the day), the query can be simplified a bit from a performance perspective.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement