Skip to content
Advertisement

SQL monthly rolling sum

I am trying to calculate monthly balances of bank accounts from the following postgresql table, containing transactions:

# d transactions
                 View "public.transactions"
 Column |       Type       | Collation | Nullable | Default 
--------+------------------+-----------+----------+---------
 year   | double precision |           |          | 
 month  | double precision |           |          | 
 bank   | text             |           |          | 
 amount | numeric          |           |          | 

In “rolling sum” I mean that the sum should contain the sum of all transactions until the end of the given month from the beginning of time, not just all transactions in thegiven month.

I came up with the following query:

select 
    a.year, a.month, a.bank, 
    (select sum(b.amount) from transactions b 
     where b.year < a.year 
        or (b.year = a.year and b.month <= a.month)) 
from 
    transactions a 
order by 
    bank, year, month;

The problem is that this contains as many rows for each of the months for each banks as many transactions were there. If more, then more, if none, then none. I would like a query which contains exactly one row for each bank and month for the whole time interval including the first and last transaction.

How to do that?

An example dataset and a query can be found at https://rextester.com/WJP53830 , courtesy of @a_horse_with_no_name

Advertisement

Answer

You need to generate a list of months first, then you can outer join your transactions table to that list.

with all_years as (
  select y.year, m.month, b.bank
  from generate_series(2010, 2019) as y(year) --<< adjust here for your desired range of years
    cross join generate_series(1,12) as m(month)
    cross join (select distinct bank from transactions) as b(bank)
)
select ay.*, sum(amount) over (partition by ay.bank order by ay.year, ay.month)
from all_years ay
  left join transactions t on (ay.year, ay.month, ay.bank) = (t.year::int, t.month::int, t.bank)
order by bank, year, month;  

The cross join with all banks is necessary so that the all_years CTE will also contain a bank for each month row.

Online example: https://rextester.com/ZZBVM16426

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement