What I want to do is to calculate the number of shifts and hours worked by each employee in any given 7-day period. In order to achieve this, I need to identify and group ‘islands’ of shifts. Note that this 7-day period is not tied to a calendar week and the beginning and ending of this 7-day period would vary from employee to employee. This is sets it apart from other similar questions asked her in the past.
I have a table like this:
Person ID Start Date End Date Start time End time Hours Worked
12345 06-07-20 06-07-20 6:00 AM 7:45 AM 1.75
12345 06-07-20 06-07-20 8:15 AM 8:45 AM 0.50
12345 06-07-20 06-07-20 9:19 AM 9:43 AM 0.40
12345 08-07-20 08-07-20 12:00 AM 12:39 AM 0.65
12345 09-07-20 09-07-20 10:05 PM 11:59 PM 1.90
12345 11-07-20 11-07-20 4:39 PM 4:54 PM 0.25
12345 22-07-20 22-07-20 7:00 AM 7:30 AM 0.50
12345 23-07-20 23-07-20 1:00 PM 3:00 PM 2.00
12345 24-07-20 24-07-20 9:14 AM 9:35 AM 0.35
12345 27-07-20 27-07-20 4:00 PM 6:00 PM 2.00
12345 27-07-20 27-07-20 2:00 PM 4:00 PM 2.00
12345 28-07-20 28-07-20 9:00 AM 10:00 AM 1.00
12345 28-07-20 28-07-20 4:39 AM 4:59 AM 0.34
I want group and summarise the data above like this:
Person ID From To Number of shifts Number of Hours
12345 06-07-20 11-07-20 6 5.45
12345 22-07-20 28-07-20 7 8.19
Note that the first grouping for employee 12345
starts on 06-07-20
and ends on 11-07-20
because these shifts fall within the 06-07-20
– 13-07-20
7-day window.
The next day 7-day window is from 22-07-20
to 28-07-20
, which means that the start date for the 7-day window has to be dynamic and based on the data i.e. not constant which makes this a complex task.
Also note that an employee may work multiple shifts in a day and that the shifts may not be consecutive.
I was playing around with using DATEDIFF()
with LAG()
and LEAD()
but was unable to get to where I want. Any help would be appreciated.
Advertisement
Answer
I think you need a recursive CTE gor this. The idea is to enumerate the shifts of each person, and then iteratively walk the dataset, while keeping track of the first date of the period – when there is more than 7 days between the start of a period and the current date, the start date resets, and a new group starts.
with recursive
data as (select t.*, row_number() over(partition by personid order by start_date) rn from mytable t)
cte as (
select personid, start_date, start_date end_date, hours_worked, rn
from data
where rn = 1
union all
select
c.personid,
case when d.start_date > dateadd(day, 7, c.start_date) then d.start_date else c.start_date end,
d.start_date,
d.hours_worked,
d.rn
from cte c
inner join data d on d.personid = c.personid and d.rn = c.rn + 1
)
select personid, start_date, max(start_date) end_date, count(*) no_shifts, sum(hours_worked)
from cte
group by personid, start_date
This assumes that:
dates do not span over multiple days, as shown in your sample data
dates are stored as
date
datatype, and times astime