Grouping shift data by 7-day windows in SQL Server 2012

What I want to do is to calculate the number of shifts and hours worked by each employee in any given 7-day period. In order to achieve this, I need to identify and group ‘islands’ of shifts. Note that this 7-day period is not tied to a calendar week and the beginning and ending of this 7-day period would vary from employee to employee. This is sets it apart from other similar questions asked her in the past.

I have a table like this:

Person ID   Start Date  End Date    Start time      End time    Hours Worked
12345       06-07-20    06-07-20    6:00 AM         7:45 AM     1.75
12345       06-07-20    06-07-20    8:15 AM         8:45 AM     0.50
12345       06-07-20    06-07-20    9:19 AM         9:43 AM     0.40
12345       08-07-20    08-07-20    12:00 AM        12:39 AM    0.65
12345       09-07-20    09-07-20    10:05 PM        11:59 PM    1.90
12345       11-07-20    11-07-20    4:39 PM         4:54 PM     0.25
12345       22-07-20    22-07-20    7:00 AM         7:30 AM     0.50
12345       23-07-20    23-07-20    1:00 PM         3:00 PM     2.00
12345       24-07-20    24-07-20    9:14 AM         9:35 AM     0.35
12345       27-07-20    27-07-20    4:00 PM         6:00 PM     2.00
12345       27-07-20    27-07-20    2:00 PM         4:00 PM     2.00
12345       28-07-20    28-07-20    9:00 AM         10:00 AM    1.00
12345       28-07-20    28-07-20    4:39 AM         4:59 AM     0.34

​x
 
Person ID   Start Date  End Date    Start time      End time    Hours Worked12345       06-07-20    06-07-20    6:00 AM         7:45 AM     1.7512345       06-07-20    06-07-20    8:15 AM         8:45 AM     0.5012345       06-07-20    06-07-20    9:19 AM         9:43 AM     0.4012345       08-07-20    08-07-20    12:00 AM        12:39 AM    0.6512345       09-07-20    09-07-20    10:05 PM        11:59 PM    1.9012345       11-07-20    11-07-20    4:39 PM         4:54 PM     0.2512345       22-07-20    22-07-20    7:00 AM         7:30 AM     0.5012345       23-07-20    23-07-20    1:00 PM         3:00 PM     2.0012345       24-07-20    24-07-20    9:14 AM         9:35 AM     0.3512345       27-07-20    27-07-20    4:00 PM         6:00 PM     2.0012345       27-07-20    27-07-20    2:00 PM         4:00 PM     2.0012345       28-07-20    28-07-20    9:00 AM         10:00 AM    1.0012345       28-07-20    28-07-20    4:39 AM         4:59 AM     0.34​

I want group and summarise the data above like this:

Person ID   From        To          Number of shifts    Number of Hours
12345       06-07-20    11-07-20    6                   5.45
12345       22-07-20    28-07-20    7                   8.19

 
Person ID   From        To          Number of shifts    Number of Hours12345       06-07-20    11-07-20    6                   5.4512345       22-07-20    28-07-20    7                   8.19​

Note that the first grouping for employee 12345 starts on 06-07-20 and ends on 11-07-20 because these shifts fall within the 06-07-20 – 13-07-20 7-day window.

The next day 7-day window is from 22-07-20 to 28-07-20, which means that the start date for the 7-day window has to be dynamic and based on the data i.e. not constant which makes this a complex task.

Also note that an employee may work multiple shifts in a day and that the shifts may not be consecutive.

I was playing around with using DATEDIFF() with LAG() and LEAD() but was unable to get to where I want. Any help would be appreciated.

Answer

I think you need a recursive CTE gor this. The idea is to enumerate the shifts of each person, and then iteratively walk the dataset, while keeping track of the first date of the period – when there is more than 7 days between the start of a period and the current date, the start date resets, and a new group starts.

with recursive 
    data as (select t.*, row_number() over(partition by personid order by start_date) rn from mytable t)
    cte as (
        select personid, start_date, start_date end_date, hours_worked, rn 
        from data 
        where rn = 1
        union all
        select 
            c.personid, 
            case when d.start_date > dateadd(day, 7, c.start_date) then d.start_date else c.start_date end,
            d.start_date,
            d.hours_worked,
            d.rn
        from cte c
        inner join data d on d.personid = c.personid and d.rn = c.rn + 1
    )
select personid, start_date, max(start_date) end_date, count(*) no_shifts, sum(hours_worked)
from cte
group by personid, start_date

 
with recursive     data as (select t.*, row_number() over(partition by personid order by start_date) rn from mytable t)    cte as (        select personid, start_date, start_date end_date, hours_worked, rn         from data         where rn = 1        union all        select             c.personid,             case when d.start_date > dateadd(day, 7, c.start_date) then d.start_date else c.start_date end,            d.start_date,            d.hours_worked,            d.rn        from cte c        inner join data d on d.personid = c.personid and d.rn = c.rn + 1    )select personid, start_date, max(start_date) end_date, count(*) no_shifts, sum(hours_worked)from ctegroup by personid, start_date​

This assumes that:

dates do not span over multiple days, as shown in your sample data
dates are stored as date datatype, and times as time

Advertisement

Answer