I have the below tables:
1) My Company table
id | c_name | c_code | status ----+------------+----------+-------- 1 | AAAAAAAAAA | AA1234 | Active
2) My User table
id | c_id | u_name | status | emp_id ----+------------+----------+--------+-------- 1 | 1 | XXXXXXXX | Active | 1 2 | 1 | YYYYYYYY | Active | 2
3) My Attendance table
id | u_id | swipe_time | status ----+--------+------------------------+-------- 1 | 1 | 2020-08-20 16:00:00 | IN 2 | 1 | 2020-08-20 20:00:00 | OUT 3 | 1 | 2020-08-20 21:00:00 | IN 4 | 1 | 2020-08-21 01:00:00 | OUT 5 | 1 | 2020-08-21 16:00:00 | IN 6 | 1 | 2020-08-21 19:00:00 | OUT
I need to calculate the attendance grouped by date, u_id like below:
Note: The query parameters would be the “From Date”, “To Date” and “Company ID”
u_id | u_name | date | in_time | out_time | hrs -----+-----------+-------------+----------------------+----------------------+----- 1 | XXXXXXXX | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-21 01:00:00 | 7 1 | XXXXXXXX | 2020-08-21 | 2020-08-21 16:00:00 | 2020-08-21 19:00:00 | 4 2 | YYYYYYYY | null | null | null | 0
Is this possible in PostgreSQL?
Advertisement
Answer
The tricky part is to expand one row that covers two (calendar) days to two rows and allocating the hours of the “next” day correctly.
The first part is to get a pivot table that combines IN/OUT pairs into a single row.
A simple (yet not very efficient) approach is:
select ain.u_id, ain.swipe_time as time_in, (select min(aout.swipe_time) from attendance aout where aout.u_id = ain.u_id and aout.status = 'OUT' and aout.swipe_time > ain.swipe_time) as time_out from attendance ain where ain.status = 'IN'
The next step is to break up the rows with more than one day into two rows.
This assumes that you never have an IN/OUT pair that covers more than two days!
with inout as ( select ain.u_id, ain.swipe_time as time_in, (select min(aout.swipe_time) from attendance aout where aout.u_id = ain.u_id and aout.status = 'OUT' and aout.swipe_time > ain.swipe_time) as time_out from attendance ain where ain.status = 'IN' ), expanded as ( select u_id, time_in::date as "date", time_in, time_out from inout where time_in::date = time_out::date union all select i.u_id, x.time_in::date as date, x.time_in, x.time_out from inout i cross join lateral ( select i.u_id, i.time_in, i.time_in::date + 1 as time_out union all select i.u_id, i.time_out::date, i.time_out ) x where i.time_out::date > i.time_in::date ) select * from expanded;
The above returns the following for your sample data:
u_id | date | time_in | time_out -----+------------+---------------------+-------------------- 1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-20 20:00:00 1 | 2020-08-20 | 2020-08-20 21:00:00 | 2020-08-21 00:00:00 1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 01:00:00 1 | 2020-08-21 | 2020-08-21 16:00:00 | 2020-08-21 19:00:00
How does this work?
So we first select all those rows that start and end on the same day with this part:
select u_id, time_in::date as "date", time_in, time_out from inout where time_in::date = time_out::date
The second part of the union splits up the rows that span two days by using a cross join that generates one row with the original start time and midnight, and another from midnight to the original end time:
select i.u_id, x.time_in::date as date, x.time_in, x.time_out from inout i cross join lateral ( -- this generates a row for the first of the two days select i.u_id, i.time_in, i.time_in::date + 1 as time_out union all -- this generates the row for the next day select i.u_id, i.time_out::date, i.time_out ) x where i.time_out::date > i.time_in::date
At the end the new “expanded” rows are then aggregated by grouping them by user and date and left joined to the users
table to get the username as well.
with inout as ( select ain.u_id, ain.swipe_time as time_in, (select min(aout.swipe_time) from attendance aout where aout.u_id = ain.u_id and aout.status = 'OUT' and aout.swipe_time > ain.swipe_time) as time_out from attendance ain where ain.status = 'IN' ), expanded as ( select u_id, time_in::date as "date", time_in, time_out from inout where time_in::date = time_out::date union all select i.u_id, x.time_in::date as date, x.time_in, x.time_out from inout i cross join lateral ( select i.u_id, i.time_in, i.time_in::date + 1 as time_out union all select i.u_id, i.time_out::date, i.time_out ) x where i.time_out::date > i.time_in::date ) select u.id, u.u_name, e."date", min(e.time_in) as time_in, max(e.time_out) as time_out, sum(e.time_out - e.time_in) as duration from users u left join expanded e on u.id = e.u_id group by u.id, u.u_name, e."date" order by u.id, e."date";
Which then results in:
u_id | date | time_in | time_out | duration -----+------------+---------------------+---------------------+---------------------------------------------- 1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-21 00:00:00 | 0 years 0 mons 0 days 7 hours 0 mins 0.0 secs 1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 19:00:00 | 0 years 0 mons 0 days 4 hours 0 mins 0.0 secs
The “duration” column is an interval
which you can format to your liking.