Skip to content
Advertisement

Postgres – calculate total working hours based on IN and OUT entry

I have the below tables:

1) My Company table

 id |   c_name   |  c_code  | status 
----+------------+----------+--------
  1 | AAAAAAAAAA |  AA1234  | Active 


2) My User table

 id |    c_id    | u_name   | status | emp_id 
----+------------+----------+--------+--------
  1 |      1     | XXXXXXXX | Active |    1   
  2 |      1     | YYYYYYYY | Active |    2   


3) My Attendance table

 id |  u_id  |        swipe_time      | status 
----+--------+------------------------+--------
  1 |   1    |  2020-08-20 16:00:00   | IN     
  2 |   1    |  2020-08-20 20:00:00   | OUT    
  3 |   1    |  2020-08-20 21:00:00   | IN     
  4 |   1    |  2020-08-21 01:00:00   | OUT    
  5 |   1    |  2020-08-21 16:00:00   | IN     
  6 |   1    |  2020-08-21 19:00:00   | OUT    


I need to calculate the attendance grouped by date, u_id like below:
Note: The query parameters would be the “From Date”, “To Date” and “Company ID”

u_id |   u_name  |     date    |        in_time       |        out_time      | hrs 
-----+-----------+-------------+----------------------+----------------------+-----
 1   |  XXXXXXXX | 2020-08-20  |  2020-08-20 16:00:00 |  2020-08-21 01:00:00 |  7  
 1   |  XXXXXXXX | 2020-08-21  |  2020-08-21 16:00:00 |  2020-08-21 19:00:00 |  4  
 2   |  YYYYYYYY |     null    |        null          |        null          |  0  


Is this possible in PostgreSQL?

Advertisement

Answer

The tricky part is to expand one row that covers two (calendar) days to two rows and allocating the hours of the “next” day correctly.

The first part is to get a pivot table that combines IN/OUT pairs into a single row.

A simple (yet not very efficient) approach is:

  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'

The next step is to break up the rows with more than one day into two rows.

This assumes that you never have an IN/OUT pair that covers more than two days!

with inout as (
  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  
  union all
  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date  
)
select *
from expanded;

The above returns the following for your sample data:

u_id | date       | time_in             | time_out           
-----+------------+---------------------+--------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-20 20:00:00
   1 | 2020-08-20 | 2020-08-20 21:00:00 | 2020-08-21 00:00:00
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 01:00:00
   1 | 2020-08-21 | 2020-08-21 16:00:00 | 2020-08-21 19:00:00

How does this work?

So we first select all those rows that start and end on the same day with this part:

  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  

The second part of the union splits up the rows that span two days by using a cross join that generates one row with the original start time and midnight, and another from midnight to the original end time:

  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       -- this generates a row for the first of the two days
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       -- this generates the row for the next day
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date 

At the end the new “expanded” rows are then aggregated by grouping them by user and date and left joined to the users table to get the username as well.

with inout as (
  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  
  union all
  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date  
)
select u.id,
       u.u_name,
       e."date", 
       min(e.time_in) as time_in,
       max(e.time_out) as time_out,
       sum(e.time_out - e.time_in) as duration
from users u
  left join expanded e on u.id = e.u_id
group by u.id, u.u_name, e."date"
order by u.id, e."date";

Which then results in:

u_id | date       | time_in             | time_out            | duration                                     
-----+------------+---------------------+---------------------+----------------------------------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-21 00:00:00 | 0 years 0 mons 0 days 7 hours 0 mins 0.0 secs
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 19:00:00 | 0 years 0 mons 0 days 4 hours 0 mins 0.0 secs

The “duration” column is an interval which you can format to your liking.

Online example

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement