I have some monthly data as below:
Month      | Category      | Monthly Value
2020-07-01 | Food          | 1
2020-07-01 | Entertainment | 4
2020-08-01 | Entertainment | 2
2020-09-01 | Entertainment | 1
I want to calculate the cumulative sum for each Category and get the result as below:
Month      | Category      | Cumulative Sum
2020-07-01 | Food          | 1
2020-08-01 | Food          | 1
2020-09-01 | Food          | 1
2020-07-01 | Entertainment | 4
2020-08-01 | Entertainment | 6
2020-09-01 | Entertainment | 7
I’m writing the window sum query as below:
SELECT month,
       category,
       sum("monthly value") OVER (
           PARTITION BY "category"
           ORDER BY "month" ASC
           ROWS UNBOUNDED PRECEDING
       ) AS "Cumulative Sum"
FROM (
    SELECT date_trunc('month', daily_date) AS month,
           category,
           sum(daily_value) AS "monthly value"
    FROM sample_table
    GROUP BY date_trunc('month', daily_date), category
) AS monthly
But, I’m getting as follows:
Month      | Category      | Cumulative Sum
2020-07-01 | Food          | 1
2020-07-01 | Entertainment | 4
2020-08-01 | Entertainment | 6
2020-09-01 | Entertainment | 7
Why is the “Food” category’s cumulative sum not showing up for 2020-08-01 and 2020-09-01? How can I get the expected result (shown in the second table)?
I’m using Redshift btw. Thanks!
Answer
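A window function only aggregates rows that already exist, and the data has no “Food” rows for 2020-08-01 or 2020-09-01. Use a cross join to generate every month/category combination and then a left join to bring in the values: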
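-- t stands for the monthly aggregate (month, category, monthly_value),
-- e.g. the subquery from the question
select m.month,
       c.category,
       t.monthly_value,
       sum(t.monthly_value) over (
           partition by c.category
           order by m.month
           rows unbounded preceding  -- Redshift requires an explicit frame when an aggregate window has ORDER BY
       ) as running_monthly_value
from (select distinct month from t) m
cross join (select distinct category from t) c
left join t
       on t.month = m.month
      and t.category = c.category;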
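To check the cross join plus left join approach against the sample data from the question, here is a minimal sketch that runs it locally. It assumes Python's built-in sqlite3 module (SQLite 3.25 or newer, for window functions) as a stand-in for Redshift, and a hypothetical table t that already holds the monthly totals; neither is from the original post.

```python
import sqlite3

# Assumption: SQLite stands in for Redshift so the sketch can run locally.
# The table name `t` and its columns are hypothetical; they mirror the
# monthly totals shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month TEXT, category TEXT, monthly_value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2020-07-01", "Food", 1),
        ("2020-07-01", "Entertainment", 4),
        ("2020-08-01", "Entertainment", 2),
        ("2020-09-01", "Entertainment", 1),
    ],
)

query = """
SELECT m.month,
       c.category,
       SUM(t.monthly_value) OVER (
           PARTITION BY c.category
           ORDER BY m.month
           ROWS UNBOUNDED PRECEDING
       ) AS cumulative_sum
FROM (SELECT DISTINCT month FROM t) m
CROSS JOIN (SELECT DISTINCT category FROM t) c
LEFT JOIN t
       ON t.month = m.month
      AND t.category = c.category
ORDER BY c.category DESC, m.month
"""

# Every month/category pair now exists, so missing months carry the
# running total forward (SUM skips the NULLs produced by the left join).
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that a category with no value in its earliest months would show NULL there, since SUM over only NULLs is NULL; wrapping the window sum in coalesce(..., 0) is one way to show 0 instead.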