
PostgreSQL count of each status per day

I have the following table:

Reservations
| id | status    | created_at          |
|  1 | Opened    | 2019-11-12 11:46:11 |
|  1 | Completed | 2019-11-19 23:03:24 |
|  1 | Pending   | 2019-11-15 12:04:13 |
|  2 | Opened    | 2019-11-14 11:46:11 |
|  2 | Completed | 2019-11-20 23:03:24 |
|  2 | Pending   | 2019-11-17 12:04:13 |

I also have a table with every calendar day from 2019-11-01 to 2019-12-31.

I need to find out how many occurrences of each status exist per calendar day for the time span listed above.

If a status is Opened on 2019-12-14 and Pending on 2019-12-17, I need to count that it was Opened for every day from 2019-12-14 to 2019-12-17.

Ideal:

|2019-11-12 00:00:00 | Opened    | 1 |
|2019-11-12 00:00:00 | Pending   | 0 |
|2019-11-12 00:00:00 | Completed | 0 |
|2019-11-13 00:00:00 | Opened    | 1 |
|2019-11-13 00:00:00 | Pending   | 0 |
|2019-11-13 00:00:00 | Completed | 0 |
|2019-11-14 00:00:00 | Opened    | 2 |
|2019-11-14 00:00:00 | Pending   | 0 |
|2019-11-14 00:00:00 | Completed | 0 |
|2019-11-15 00:00:00 | Opened    | 1 |
|2019-11-15 00:00:00 | Pending   | 1 |
|2019-11-15 00:00:00 | Completed | 0 |

Any help is greatly appreciated.

Edit: The solution from GMB below is very close, but it leaves me with the following table:

| status    | created_at          | ended_at            |
| Opened    | 2019-11-12 11:46:11 | 2019-11-15 12:04:13 |
| Pending   | 2019-11-15 12:04:13 | 2019-11-19 23:03:24 |
| Completed | 2019-11-19 23:03:24 |                     |
| Opened    | 2019-11-14 11:46:11 | 2019-11-17 12:04:13 |
| Pending   | 2019-11-17 12:04:13 | 2019-11-20 23:03:24 |
| Completed | 2019-11-20 23:03:24 |                     |

How do I fill in the missing ended_at values with the end date of my range (2019-12-31)?

Answer

Consider the following query:

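select 
    c.dt,
    s.status,
    -- counts only rows matched by the left join, so unmatched pairs give 0
    count(t.status)
from 
    calendar c
    -- one row per (day, status) pair, so every status appears for every day
    cross join (select distinct status from reservations) s
    left join (
        select 
            status, 
            created_at, 
            -- timestamp at which the next status of the same id begins
            lead(created_at) over(partition by id order by created_at) ended_at
        from reservations
    ) t
        on  t.status = s.status
        -- keep statuses that are in effect at the end of day c.dt
        and c.dt + interval '1 day' >= t.created_at
        and c.dt + interval '1 day' < t.ended_at
group by c.dt, s.status
order by c.dt, s.status;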

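This works by cross-joining the calendar table with the list of distinct statuses found in the table, then left-joining the result with a subquery that uses lead() to get the date of the next status recorded for the same reservation id. A status is counted for a given day if it is the one in effect at the end of that day, which is why the join compares c.dt + interval '1 day' with created_at and ended_at. If you have a table of statuses, you can use it instead of the subquery that selects the distinct statuses.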
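
Regarding the edit to the question: lead() returns NULL for the last status of each reservation, so those rows never satisfy the comparison against ended_at and are never counted. One possible way around this, sketched below, is to wrap lead() in coalesce() with an open-ended upper bound. The choice of 'infinity' is an assumption and not part of the original answer, and the CTE only stands in for the real reservations table using the sample rows from the question.

```sql
-- Sample rows copied from the question; this CTE stands in for the real table.
with reservations (id, status, created_at) as (
    values
        (1, 'Opened',    timestamp '2019-11-12 11:46:11'),
        (1, 'Pending',   timestamp '2019-11-15 12:04:13'),
        (1, 'Completed', timestamp '2019-11-19 23:03:24'),
        (2, 'Opened',    timestamp '2019-11-14 11:46:11'),
        (2, 'Pending',   timestamp '2019-11-17 12:04:13'),
        (2, 'Completed', timestamp '2019-11-20 23:03:24')
)
select
    id,
    status,
    created_at,
    -- lead() is NULL for the last status of each id; coalesce() replaces
    -- that NULL with an open-ended bound (assumed here, see the note above)
    coalesce(
        lead(created_at) over (partition by id order by created_at),
        timestamp 'infinity'
    ) as ended_at
from reservations
order by id, created_at;
```

If this expression replaces the plain lead() call in the subquery t of the query above, the Completed rows should be counted from their created_at day through the end of the calendar. A literal bound of 2019-12-31 would likely drop the last days of the range, because the join uses a strict < against c.dt + interval '1 day'; any bound later than 2020-01-01 00:00:00 avoids that.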

Demo on DB Fiddle:

dt                     | status    | count
:--------------------- | :-------- | ----:
2019-11-12 00:00:00+00 | Completed |     0
2019-11-12 00:00:00+00 | Opened    |     1
2019-11-12 00:00:00+00 | Pending   |     0
2019-11-13 00:00:00+00 | Completed |     0
2019-11-13 00:00:00+00 | Opened    |     1
2019-11-13 00:00:00+00 | Pending   |     0
2019-11-14 00:00:00+00 | Completed |     0
2019-11-14 00:00:00+00 | Opened    |     2
2019-11-14 00:00:00+00 | Pending   |     0
2019-11-15 00:00:00+00 | Completed |     0
2019-11-15 00:00:00+00 | Opened    |     1
2019-11-15 00:00:00+00 | Pending   |     1

Note that the DB Fiddle demonstrates how to use the handy Postgres function generate_series() to fill in the calendar table.
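
The fiddle itself is not reproduced here. As a rough sketch of the idea, a calendar table covering the range from the question could be built as follows. The table name calendar and the column name dt are taken from the query above; the column type (timestamp without time zone) is an assumption and may differ from what the fiddle uses.

```sql
-- Build one row per calendar day from 2019-11-01 to 2019-12-31 inclusive.
create table calendar as
select g.dt
from generate_series(
    timestamp '2019-11-01',   -- first day of the range
    timestamp '2019-12-31',   -- last day of the range (inclusive)
    interval '1 day'          -- step
) as g(dt);
```

This yields 61 rows, one for each day of November and December 2019.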

User contributions licensed under: CC BY-SA