I am trying to calculate monthly balances of bank accounts from the following postgresql table, containing transactions:
# d transactions View "public.transactions" Column | Type | Collation | Nullable | Default --------+------------------+-----------+----------+--------- year | double precision | | | month | double precision | | | bank | text | | | amount | numeric | | |
In “rolling sum” I mean that the sum should contain the sum of all transactions until the end of the given month from the beginning of time, not just all transactions in thegiven month.
I came up with the following query:
select a.year, a.month, a.bank, (select sum(b.amount) from transactions b where b.year < a.year or (b.year = a.year and b.month <= a.month)) from transactions a order by bank, year, month;
The problem is that this contains as many rows for each of the months for each banks as many transactions were there. If more, then more, if none, then none. I would like a query which contains exactly one row for each bank and month for the whole time interval including the first and last transaction.
How to do that?
An example dataset and a query can be found at https://rextester.com/WJP53830 , courtesy of @a_horse_with_no_name
Advertisement
Answer
You need to generate a list of months first, then you can outer join your transactions table to that list.
with all_years as ( select y.year, m.month, b.bank from generate_series(2010, 2019) as y(year) --<< adjust here for your desired range of years cross join generate_series(1,12) as m(month) cross join (select distinct bank from transactions) as b(bank) ) select ay.*, sum(amount) over (partition by ay.bank order by ay.year, ay.month) from all_years ay left join transactions t on (ay.year, ay.month, ay.bank) = (t.year::int, t.month::int, t.bank) order by bank, year, month;
The cross join with all banks is necessary so that the all_years
CTE will also contain a bank for each month row.
Online example: https://rextester.com/ZZBVM16426