I have a table where I register a debt and the paid date:
CREATE TABLE my_table
(
the_debt_id varchar(6) NOT NULL,
the_debt_paid timestamp NOT NULL,
the_debt_due date NOT NULL
);
INSERT INTO my_table
VALUES ('LMUS01', '2019-05-02 09:00:01', '2019-05-02'),
('LMUS01', '2019-06-03 10:45:12', '2019-06-02'),
('LMUS01', '2019-07-01 15:39:58', '2019-07-02'),
('LMUS02', '2019-05-03 19:43:44', '2019-05-07'),
('LMUS02', '2019-06-07 08:37:05', '2019-06-07');
What I want is to aggregate this data per the_debt_id: payments (the number of payments per debt), tardiness (the number of payments where the_debt_paid is later than the_debt_due), the first due date per debt, and the fraction of each debt's payments that were late. This table should give the idea:
the_debt_id  payments  tardiness  first_due_date  percentage
LMUS01       3         1          2019-05-02      0.33
LMUS02       2         0          2019-05-07      0
So I tried this so far:
WITH t1 AS(
SELECT the_debt_id, the_debt_due, the_debt_paid,
CASE
WHEN the_debt_paid::date > the_debt_due THEN 1
ELSE 0
END AS tardiness
FROM my_table),
t2 AS(
SELECT the_debt_id,
sum(tardiness) AS tardiness,
count(the_debt_id) AS payments,
first_value(the_debt_due)
FROM t1
GROUP BY the_debt_id),
t3 AS(
SELECT *,
tardiness/payments::float AS percentage
FROM t2)
SELECT * FROM t3
I get an error saying that I need an OVER clause, which I take to mean that I need a partition, but I'm not sure how to combine GROUP BY and PARTITION BY. Any help will be greatly appreciated.
Answer
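Plain aggregation seems appropriate. first_value() is a window function, which is why it needs an OVER clause; since the first due date is simply the earliest one per debt, min() with GROUP BY is enough: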
select the_debt_id,
count(*) as payments,
count(*) filter (where the_debt_paid::date > the_debt_due) as num_tardy,
min(the_debt_due) as first_due_date,
avg( (the_debt_paid::date > the_debt_due)::int ) as tardy_ratio
from my_table t
group by the_debt_id;
Here is a db<>fiddle.
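If the output should use the column names from the question and show the ratio with two decimals, the same aggregation can be adapted as in the sketch below. It assumes PostgreSQL (the ::date cast and the FILTER clause already point that way), and the rounding to two places is an assumption taken from the 0.33 shown in the desired output.

```sql
-- Sketch only: same aggregation as above, with the aliases the question asked for.
select the_debt_id,
       count(*) as payments,
       -- number of payments made after the due date
       count(*) filter (where the_debt_paid::date > the_debt_due) as tardiness,
       -- earliest due date per debt
       min(the_debt_due) as first_due_date,
       -- avg() over an integer returns numeric, so round(numeric, int) applies
       round(avg((the_debt_paid::date > the_debt_due)::int), 2) as percentage
from my_table
group by the_debt_id
order by the_debt_id;
```

With the sample rows from the question, this should give 0.33 for LMUS01 (one late payment out of three) and 0.00 for LMUS02.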