I have a table where I record each debt payment, with the timestamp it was paid and the date it was due:
CREATE TABLE my_table
(
    the_debt_id   varchar(6) NOT NULL,
    the_debt_paid timestamp  NOT NULL,
    the_debt_due  date       NOT NULL
);

INSERT INTO my_table
VALUES ('LMUS01', '2019-05-02 09:00:01', '2019-05-02'),
       ('LMUS01', '2019-06-03 10:45:12', '2019-06-02'),
       ('LMUS01', '2019-07-01 15:39:58', '2019-07-02'),
       ('LMUS02', '2019-05-03 19:43:44', '2019-05-07'),
       ('LMUS02', '2019-06-07 08:37:05', '2019-06-07');
What I want is to aggregate this data per the_debt_id into: payments (the number of payments per debt), tardiness (the number of payments where the paid date is later than the due date), first_due_date (the earliest due date per debt) and percentage (the share of payments that were late). This table should give the idea:
the_debt_id  payments  tardiness  first_due_date  percentage
LMUS01       3         1          2019-05-02      0.33
LMUS02       2         0          2019-05-07      0
This is what I have tried so far:
WITH t1 AS (
    SELECT the_debt_id,
           the_debt_due,
           the_debt_paid,
           CASE WHEN the_debt_paid::date > the_debt_due THEN 1 ELSE 0 END AS tardiness
    FROM my_table
),
t2 AS (
    SELECT the_debt_id,
           sum(tardiness) AS tardiness,
           count(the_debt_id) AS payments,
           first_value(the_debt_due)
    FROM t1
    GROUP BY the_debt_id
),
t3 AS (
    SELECT *,
           tardiness/payments::float AS percentage
    FROM t2
)
SELECT * FROM t3
I get an error saying that I need an OVER clause, which means I need a partition, but I’m not sure how to combine GROUP BY and PARTITION BY. Any help will be greatly appreciated.
Answer
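Plain aggregation seems appropriate here. first_value() is a window function, which is why it demands an OVER clause; the earliest due date per debt is simply min(the_debt_due):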
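select the_debt_id,
       count(*) as payments,
       -- count only the payments made after their due date
       count(*) filter (where the_debt_paid::date > the_debt_due) as num_tardy,
       -- the earliest due date per debt
       min(the_debt_due) as first_due_date,
       -- the boolean is cast to 0/1, so the average is the share of late payments
       avg( (the_debt_paid::date > the_debt_due)::int ) as tardy_ratio
from my_table t
group by the_debt_id;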
Here is a db<>fiddle.
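If you do want to keep first_value() and combine it with GROUP BY, one option is to evaluate the window function per row first and aggregate afterwards. The following is only a sketch, assuming PostgreSQL and the my_table definition from the question; the CTE name per_payment is made up for illustration, and the output column names follow the table in the question rather than the aliases used in the query above.

```sql
-- Sketch, assuming PostgreSQL and my_table as defined in the question.
-- per_payment is a hypothetical CTE name chosen for this example.
WITH per_payment AS (
    SELECT the_debt_id,
           -- 1 when the payment was made after its due date, otherwise 0
           (the_debt_paid::date > the_debt_due)::int AS tardiness,
           -- window function: needs OVER, evaluated per row before any grouping
           first_value(the_debt_due) OVER (
               PARTITION BY the_debt_id
               ORDER BY the_debt_due
           ) AS first_due_date
    FROM my_table
)
SELECT the_debt_id,
       count(*)                 AS payments,
       sum(tardiness)           AS tardiness,
       -- the value is identical on every row of a debt, so min() just picks it
       min(first_due_date)      AS first_due_date,
       round(avg(tardiness), 2) AS percentage
FROM per_payment
GROUP BY the_debt_id
ORDER BY the_debt_id;
```

For the sample rows this should give 3 payments, 1 late payment and 0.33 for LMUS01, and 2 payments, 0 late payments and 0.00 for LMUS02. The window function does extra work here compared with a plain min(the_debt_due), so this form is mainly useful when the per-row value is needed for something else as well.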