I have a table that specifies exactly what date and time each employee was in a particular office.
EmployeeTable
looks like this:
id | EmployeeID | DateP | TimeP |
---|---|---|---|
1 | 11111 | 1397/01/02 | 01:30 |
2 | 11111 | 1398/05/09 | 05:30 |
3 | 11111 | 1398/06/07 | 05:10 |
4 | 22222 | 1398/08/09 | 06:12 |
5 | 22222 | 1399/02/01 | 07:15 |
6 | 11111 | 1399/07/02 | 08:51 |
7 | 11111 | 1399/08/06 | 12:20 |
8 | 33333 | 1399/09/04 | 20:01 |
9 | 33333 | 1399/12/08 | 22:05 |
10 | 33333 | 1400/01/01 | 23:11 |
11 | 33333 | 1400/02/05 | 14:10 |
12 | 22222 | 1400/04/05 | 16:25 |
I want exactly select Min and Max date and time for each Employee when present in a office:
id | EmployeeID | MinDateP | TimeMinDateP | MaxDateP | TimeMaxDateP |
---|---|---|---|---|---|
1 | 11111 | 1397/01/02 | 01:30 | 1398/06/07 | 05:10 |
2 | 22222 | 1398/08/09 | 06:12 | 1399/02/01 | 07:15 |
3 | 11111 | 1399/07/02 | 08:51 | 1399/08/06 | 12:20 |
4 | 33333 | 1399/09/04 | 20:01 | 1400/02/05 | 14:10 |
5 | 22222 | 1400/04/05 | 16:25 | 1400/04/05 | 16:25 |
My SQL code is:
with tab1 as ( select * from EmployeeTable ), tab2 as ( select t1.*, case when lag(t1.EmployeeID) over(order by t1.id) is null then 1 when lag(t1.EmployeeID) over(order by t1.id) = t1.EmployeeID then 0 else 1 end lg from tab1 t1 ) , tab3 as ( select t1.*, sum(t1.lg) over(order by t1.id) grp from tab2 t1 ) select t1.EmployeeID, min(t1.DateP) as min, TimeP, max(t1.DateP)as max, TimeP from tab3 t1 group by t1.EmployeeID, t1.grp
But above codes has error. Can every body help me?
Advertisement
Answer
This is a gaps and islands problem. One approach to solve this uses the difference in row numbers method:
WITH cte AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY DateP, TimeP) rn1, ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateP, TimeP) rn2 FROM EmployeeTable ), cte2 AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY EmployeeID, rn1 - rn2 ORDER BY DateP, TimeP) rn_first, ROW_NUMBER() OVER (PARTITION BY EmployeeID, rn1 - rn2 ORDER BY DateP DESC, TimeP DESC) rn_last FROM cte ) SELECT EmployeeID, MAX(CASE WHEN rn_first = 1 THEN DateP END) AS MinDateP, MAX(CASE WHEN rn_first = 1 THEN TimeP END) AS TimeMinDateP, MAX(CASE WHEN rn_last = 1 THEN DateP END) AS MaxDateP, MAX(CASE WHEN rn_last = 1 THEN TimeP END ) AS TimeMaxDateP FROM cte2 GROUP BY EmployeeID, rn1 - rn2 ORDER BY MIN(DateP), MIN(TimeP);
Note that the logic in the second CTE would be totally unnecessary if you were using a single datetime column to represent the date and time. It is usually not beneficial to separate date and time as you are currently doing.