I have a table where I register a debt and the paid date:
CREATE TABLE my_table (
    the_debt_id   varchar(6) NOT NULL,
    the_debt_paid timestamp  NOT NULL,
    the_debt_due  date       NOT NULL
);

INSERT INTO my_table VALUES
    ('LMUS01', '2019-05-02 09:00:01', '2019-05-02'),
    ('LMUS01', '2019-06-03 10:45:12', '2019-06-02'),
    ('LMUS01', '2019-07-01 15:39:58', '2019-07-02'),
    ('LMUS02', '2019-05-03 19:43:44', '2019-05-07'),
    ('LMUS02', '2019-06-07 08:37:05', '2019-06-07');
What I want is to aggregate this data per the_debt_id into: payments (the number of payments per the_debt_id), tardiness (the number of payments where the_debt_paid is later than the_debt_due), the first due date per the_debt_id, and the percentage (the share of each debt's payments that were late). This table should give the idea:
the_debt_id  payments  tardiness  first_due_date  percentage
LMUS01       3         1          2019-05-02      0.33
LMUS02       2         0          2019-05-07      0
This is what I have tried so far:
WITH t1 AS (
    SELECT the_debt_id,
           the_debt_due,
           the_debt_paid,
           CASE WHEN the_debt_paid::date > the_debt_due THEN 1 ELSE 0 END AS tardiness
    FROM my_table
),
t2 AS (
    SELECT the_debt_id,
           sum(tardiness) AS tardiness,
           count(the_debt_id) AS payments,
           first_value(the_debt_due)
    FROM t1
    GROUP BY the_debt_id
),
t3 AS (
    SELECT *,
           tardiness/payments::float AS percentage
    FROM t2
)
SELECT * FROM t3
I get an error saying that I need an OVER clause, which I take to mean that I need a partition, but I'm not sure how to combine GROUP BY and PARTITION BY. Any help will be greatly appreciated.
Answer
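Plain aggregation seems appropriate here. first_value() is a window function, which is why it demands an OVER clause; min() is an ordinary aggregate and returns the earliest due date per group: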
select the_debt_id,
       count(*) as payments,
       count(*) filter (where the_debt_paid::date > the_debt_due) as num_tardy,
       min(the_debt_due) as first_due_date,
       avg( (the_debt_paid::date > the_debt_due)::int ) as tardy_ratio
from my_table t
group by the_debt_id;
Here is a db<>fiddle.
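If the output should match the column names and two-decimal rounding from the expected table in the question, the same aggregation can be adapted as in the sketch below. It assumes PostgreSQL (for the `::` casts and the FILTER clause); the round(..., 2) call and the ORDER BY are additions for illustration and are not part of the query above.

```sql
-- Sketch, assuming PostgreSQL and the my_table definition from the question.
select the_debt_id,
       count(*) as payments,
       -- count only the payments made after their due date
       count(*) filter (where the_debt_paid::date > the_debt_due) as tardiness,
       -- earliest due date per debt; an aggregate, so no OVER clause is needed
       min(the_debt_due) as first_due_date,
       -- boolean cast to 0/1, averaged per debt, rounded to two decimals (added for illustration)
       round(avg((the_debt_paid::date > the_debt_due)::int), 2) as percentage
from my_table
group by the_debt_id
order by the_debt_id;
```

On the sample data this should give 0.33 for LMUS01 (one late payment out of three) and 0.00 for LMUS02. Averaging the 0/1 cast avoids the integer-division pitfall that a manual tardiness/payments calculation has without a float cast.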