I have data from issues in a table with the following schema.
x
CREATE TABLE if not EXISTS issues (
id int,
created_at DATE,
deleted_at DATE
);
INSERT INTO issues VALUES
(1, '2020-01-01', '2020-01-02'),
(2, '2020-01-02', '2020-01-05'),
(3, '2020-01-04', '2020-01-03'),
(4, '2020-01-04', '2020-01-07'),
(5, '2020-01-04', '2020-01-07');
I want to know how many issues are open every day using PostgreSQL. We consider an issue being open at date x if:
x >= created_at
x <= deleted_at
There may be days where no issues were created or deleted, as in the example. How can I do the query to obtain how many issues are open every day?
Advertisement
Answer
A simple method uses generate_series()
:
select gs.dte, count(*)
from issues i cross join lateral
generate_series(i.created_at, i.deleted_at, interval '1 day') gs(dte)
group by gs.dte
order by gs.dte;
Here is a db<>fiddle.
The above skips dates with no issues. One way to fix this would be to generate all dates and then:
select gs.dte, count(i.id)
from (select min(created_at) as first_date, max(deleted_at) as last_date
from issues i
) m cross join lateral
generate_series(m.first_date, m.last_date, interval '1 day') gs(dte) left join
issues i
on gs.dte >= i.created_at and gs.dte <= i.deleted_at
group by gs.dte
order by gs.dte;
However, the first method should be more efficient if this is not necessary.