I am trying to calculate monthly balances of bank accounts from the following postgresql table, containing transactions:
# d transactions
View "public.transactions"
Column | Type | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
year | double precision | | |
month | double precision | | |
bank | text | | |
amount | numeric | | |
In “rolling sum” I mean that the sum should contain the sum of all transactions until the end of the given month from the beginning of time, not just all transactions in thegiven month.
I came up with the following query:
select
a.year, a.month, a.bank,
(select sum(b.amount) from transactions b
where b.year < a.year
or (b.year = a.year and b.month <= a.month))
from
transactions a
order by
bank, year, month;
The problem is that this contains as many rows for each of the months for each banks as many transactions were there. If more, then more, if none, then none. I would like a query which contains exactly one row for each bank and month for the whole time interval including the first and last transaction.
How to do that?
An example dataset and a query can be found at https://rextester.com/WJP53830 , courtesy of @a_horse_with_no_name
Advertisement
Answer
You need to generate a list of months first, then you can outer join your transactions table to that list.
with all_years as (
select y.year, m.month, b.bank
from generate_series(2010, 2019) as y(year) --<< adjust here for your desired range of years
cross join generate_series(1,12) as m(month)
cross join (select distinct bank from transactions) as b(bank)
)
select ay.*, sum(amount) over (partition by ay.bank order by ay.year, ay.month)
from all_years ay
left join transactions t on (ay.year, ay.month, ay.bank) = (t.year::int, t.month::int, t.bank)
order by bank, year, month;
The cross join with all banks is necessary so that the all_years
CTE will also contain a bank for each month row.
Online example: https://rextester.com/ZZBVM16426