Problem Statement: Get running sales for every calendar date for every product.
Brief Background: Entries are added to the sales table when sales is made. If a particular product has not made any sales for a particular date, no record will be inserted.
Sales Table Structure.
+------------+--------------+------------+ | date | Product Code | total_sale | +------------+--------------+------------+ | 2020-01-15 | abc | 100 | | 2020-01-16 | abc | 200 | | 2020-01-17 | abc | 200 | | 2020-01-16 | tvc | 200 | | 2020-01-16 | sfr | 200 | +------------+--------------+------------+
SQL to generate the above view.
create temporary table sales_daily as select '20200115' :: date as sales_day, 'abc' as product_Code , 100 as sales Union all select '20200116' :: date as sales_day, 'abc' as product_Code , 200 as sales Union all select '20200117' :: date as sales_day, 'abc' as product_Code , 200 as sales Union all select '20200115' :: date as sales_day, 'tvc' as product_Code , 200 as sales Union all select '20200115' :: date as sales_day, 'sfr' as product_Code , 200 as sales ; select * from sales_Daily;
Expected Output: For every single calendar day of the month (In this case JAN 2020) get rolling sales for last n days(this number can be anything,and will be harcoded in final query).
Steps Taken: I have tried existing calendar table(snippet to create calendar table shared below) for this purpose and sum window function.However, as rolling sum needs to be calculated for each product code using partition by product code sums up at product level not day level. I understand this is the expected behaviour. My ask here is what should be the approach when using redshift to solve this problem statement. Also, is it a problem statement that can be solved using window function.
SQL to create calendar table:
create temporary table calendar as select '20200101' :: date As calendar_day Union all select '20200102' :: date As calendar_day Union all select '20200103' :: date As calendar_day Union all select '20200104' :: date As calendar_day Union all select '20200105' :: date As calendar_day Union all select '20200106' :: date As calendar_day Union all select '20200107' :: date As calendar_day Union all select '20200108' :: date As calendar_day Union all select '20200109' :: date As calendar_day Union all select '20200110' :: date As calendar_day Union all select '20200111' :: date As calendar_day Union all select '20200112' :: date As calendar_day Union all select '20200113' :: date As calendar_day Union all select '20200114' :: date As calendar_day Union all select '20200115' :: date As calendar_day Union all select '20200116' :: date As calendar_day Union all select '20200117' :: date As calendar_day Union all select '20200118' :: date As calendar_day Union all select '20200119' :: date As calendar_day Union all select '20200120' :: date As calendar_day Union all select '20200121' :: date As calendar_day Union all select '20200122' :: date As calendar_day Union all select '20200123' :: date As calendar_day Union all select '20200124' :: date As calendar_day Union all select '20200125' :: date As calendar_day Union all select '20200126' :: date As calendar_day Union all select '20200127' :: date As calendar_day Union all select '20200128' :: date As calendar_day Union all select '20200129' :: date As calendar_day Union all select '20200130' :: date As calendar_day Union all select '20200131' :: date As calendar_day ;
SQL used as of now for final Output:
select calendar_Day, sales_day, product_Code, sales, sum(sales) over (partition by product_Code order by calendar_Day rows between 1 PRECEDING and current row) running_salest1day from calendar left join sales_daily on calendar_day :: date = sales_day :: date
Advertisement
Answer
You will have to use window function for cumulative sum. The query below creates dummy 0 valued sales for each product and for each calendar day. This will ensure the output has cumulative sales value row for each product and each calendar day.
select sales_day as Calendar_Day, product_Code, sales, sum(sales) over (partition by product_Code order by sales_day rows between 1 PRECEDING and current row) running_salest1day from ( select sales_day, product_Code, sum(sales) as sales from ( select sales_day, product_Code, sales -- Actual Sales entry from sales_daily union all select calendar_day as sales_day, dp.product_Code, 0 as sales -- Dummy Sales entry for each date from ( select distinct product_Code from sales_daily ) dp cross join calendar ) sd group by sales_day, product_Code ) asd
Here is a SQL Fiddle.