Skip to content
Advertisement

Calculate Cumulative Sum-Running Total/Sale for Each Day of the Month in Redshift

Problem Statement: Get running sales for every calendar date for every product.

Brief Background: Entries are added to the sales table when sales is made. If a particular product has not made any sales for a particular date, no record will be inserted.

Sales Table Structure.

+------------+--------------+------------+
| date       | Product Code | total_sale |
+------------+--------------+------------+
| 2020-01-15 |        abc   |        100 |
| 2020-01-16 |        abc   |        200 |
| 2020-01-17 |        abc   |        200 |
| 2020-01-16 |        tvc   |        200 |
| 2020-01-16 |        sfr   |        200 |
+------------+--------------+------------+

SQL to generate the above view.

    create temporary table sales_daily as 
    select '20200115' :: date as sales_day, 'abc' as product_Code , 100 as sales
    Union all 
    select '20200116' :: date as sales_day, 'abc' as product_Code , 200 as sales
    Union all 
    select '20200117' :: date as sales_day, 'abc' as product_Code , 200 as sales
    Union all 
    select '20200115' :: date as sales_day, 'tvc' as product_Code , 200 as sales
    Union all 
    select '20200115' :: date as sales_day, 'sfr' as product_Code , 200 as sales
    ;
    
    select * from sales_Daily; 

Expected Output: For every single calendar day of the month (In this case JAN 2020) get rolling sales for last n days(this number can be anything,and will be harcoded in final query).

Steps Taken: I have tried existing calendar table(snippet to create calendar table shared below) for this purpose and sum window function.However, as rolling sum needs to be calculated for each product code using partition by product code sums up at product level not day level. I understand this is the expected behaviour. My ask here is what should be the approach when using redshift to solve this problem statement. Also, is it a problem statement that can be solved using window function.

SQL to create calendar table:

create temporary table calendar as 
select '20200101' :: date As calendar_day Union all
select '20200102' :: date As calendar_day Union all
select '20200103' :: date As calendar_day Union all
select '20200104' :: date As calendar_day Union all
select '20200105' :: date As calendar_day Union all
select '20200106' :: date As calendar_day Union all
select '20200107' :: date As calendar_day Union all
select '20200108' :: date As calendar_day Union all
select '20200109' :: date As calendar_day Union all
select '20200110' :: date As calendar_day Union all
select '20200111' :: date As calendar_day Union all
select '20200112' :: date As calendar_day Union all
select '20200113' :: date As calendar_day Union all
select '20200114' :: date As calendar_day Union all
select '20200115' :: date As calendar_day Union all
select '20200116' :: date As calendar_day Union all
select '20200117' :: date As calendar_day Union all
select '20200118' :: date As calendar_day Union all
select '20200119' :: date As calendar_day Union all
select '20200120' :: date As calendar_day Union all
select '20200121' :: date As calendar_day Union all
select '20200122' :: date As calendar_day Union all
select '20200123' :: date As calendar_day Union all
select '20200124' :: date As calendar_day Union all
select '20200125' :: date As calendar_day Union all
select '20200126' :: date As calendar_day Union all
select '20200127' :: date As calendar_day Union all
select '20200128' :: date As calendar_day Union all
select '20200129' :: date As calendar_day Union all
select '20200130' :: date As calendar_day Union all
select '20200131' :: date As calendar_day 
;

SQL used as of now for final Output:

select 
calendar_Day,
sales_day,
product_Code,
sales,
sum(sales) over (partition by product_Code order by calendar_Day  rows between 1 PRECEDING and current row) running_salest1day
from  calendar
left join sales_daily on calendar_day :: date =  sales_day :: date

Advertisement

Answer

You will have to use window function for cumulative sum. The query below creates dummy 0 valued sales for each product and for each calendar day. This will ensure the output has cumulative sales value row for each product and each calendar day.

select sales_day as Calendar_Day,
       product_Code,
       sales, 
       sum(sales) over (partition by product_Code order by sales_day  rows between 1 PRECEDING and current row) running_salest1day
from (
  select sales_day, product_Code, sum(sales) as sales
  from (
    select sales_day, product_Code, sales -- Actual Sales entry
    from sales_daily
    union all
    select calendar_day as sales_day, dp.product_Code, 0 as sales -- Dummy Sales entry for each date
    from (
      select distinct product_Code from sales_daily 
    ) dp cross join calendar
  ) sd
  group by sales_day, product_Code
) asd

Here is a SQL Fiddle.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement