I have the following table:
Reservations

| id | status | created_at |
| 1 | Opened | 2019-11-12 11:46:11 |
| 1 | Completed | 2019-11-19 23:03:24 |
| 1 | Pending | 2019-11-15 12:04:13 |
| 2 | Opened | 2019-11-14 11:46:11 |
| 2 | Completed | 2019-11-20 23:03:24 |
| 2 | Pending | 2019-11-17 12:04:13 |
I also have a table with every calendar day from 2019-11-01 to 2019-12-31.
I need to find out how many occurrences of each status exist per calendar day for the time span listed above.
If a status is Opened on 2019-12-14 and Pending on 2019-12-17, I need to count that it was Opened for every day from 2019-12-14 to 2019-12-17.
Ideal:
| 2019-11-12 00:00:00 | Opened | 1 |
| 2019-11-12 00:00:00 | Pending | 0 |
| 2019-11-12 00:00:00 | Completed | 0 |
| 2019-11-13 00:00:00 | Opened | 1 |
| 2019-11-13 00:00:00 | Pending | 0 |
| 2019-11-13 00:00:00 | Completed | 0 |
| 2019-11-14 00:00:00 | Opened | 2 |
| 2019-11-14 00:00:00 | Pending | 0 |
| 2019-11-14 00:00:00 | Completed | 0 |
| 2019-11-15 00:00:00 | Opened | 1 |
| 2019-11-15 00:00:00 | Pending | 1 |
| 2019-11-15 00:00:00 | Completed | 0 |
Any help is greatly appreciated.
Edit: The solution from GMB below is very close, but it leaves me with the following table:
| status | created_at | ended_at |
| Opened | 2019-11-12 11:46:11 | 2019-11-15 12:04:13 |
| Pending | 2019-11-15 12:04:13 | 2019-11-19 23:03:24 |
| Completed | 2019-11-19 23:03:24 | |
| Opened | 2019-11-14 11:46:11 | 2019-11-17 12:04:13 |
| Pending | 2019-11-17 12:04:13 | 2019-11-20 23:03:24 |
| Completed | 2019-11-20 23:03:24 | |
How do I fill the missing ended_at values with the end date of my range (2019-12-31)?
Answer
Consider the following query:
select
    c.dt,
    s.status,
    count(t.status)
from calendar c
cross join (select distinct status from reservations) s
left join (
    select
        status,
        created_at,
        lead(created_at) over(partition by id order by created_at) ended_at
    from reservations
) t
    on  t.status = s.status
    and c.dt + interval '1 day' >= t.created_at
    and c.dt + interval '1 day' < t.ended_at
group by c.dt, s.status
order by c.dt, s.status
This works by cross-joining the calendar table with the list of distinct statuses found in the reservations table, and then left-joining the result with a subquery that uses lead() to get the date of the next status recorded for the same id. A row is counted for a given day when its status was in effect at the end of that day (c.dt + interval '1 day'). If you have a table of statuses, you can use it instead of the subquery that selects the distinct statuses.
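Regarding the edit in the question: lead() returns null for the latest status of each id, and a comparison against null is never true, so those rows (here, Completed) never match the join condition. One possible way to close the open interval is to wrap lead() in coalesce(). The sketch below assumes created_at is a timestamp column and uses Postgres' special value 'infinity' as the fallback; both are assumptions, not part of the original query. A fixed bound would also work, but with the join condition above it must be later than the day after the last calendar day, so 2019-12-31 itself would cut the count off early.

```sql
-- Sketch: replacement for the subquery aliased as t in the query above.
-- Assumption: created_at is of type timestamp (adjust the cast for timestamptz).
select
    status,
    created_at,
    coalesce(
        -- next status change for the same reservation id, if any
        lead(created_at) over (partition by id order by created_at),
        -- no later status: treat the interval as still open
        'infinity'::timestamp
    ) as ended_at
from reservations
```

With this change the latest status of each id would keep being counted on every calendar day after it was created.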
dt | status | count
:--------------------- | :-------- | ----:
2019-11-12 00:00:00+00 | Completed | 0
2019-11-12 00:00:00+00 | Opened | 1
2019-11-12 00:00:00+00 | Pending | 0
2019-11-13 00:00:00+00 | Completed | 0
2019-11-13 00:00:00+00 | Opened | 1
2019-11-13 00:00:00+00 | Pending | 0
2019-11-14 00:00:00+00 | Completed | 0
2019-11-14 00:00:00+00 | Opened | 2
2019-11-14 00:00:00+00 | Pending | 0
2019-11-15 00:00:00+00 | Completed | 0
2019-11-15 00:00:00+00 | Opened | 1
2019-11-15 00:00:00+00 | Pending | 1
Note that the DB Fiddle demonstrates how to use the handy Postgres function generate_series() to fill in the calendar table.
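For readers without access to the fiddle, filling the calendar could look roughly like this. It is only a sketch: the table name calendar and column name dt are taken from the query above, while the timestamp type and the create table ... as form are assumptions about how the table is set up.

```sql
-- Sketch: build a calendar table with one row per day
-- from 2019-11-01 through 2019-12-31 (both bounds inclusive).
create table calendar as
select dt
from generate_series(
    timestamp '2019-11-01',  -- first day of the range
    timestamp '2019-12-31',  -- last day of the range
    interval '1 day'         -- step
) as g(dt);
```

If the calendar table already exists, the same select can feed an insert into calendar (dt) statement instead.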