You may be aware of rolling the results of an aggregate over a specific number of preceding rows. I.e.: how many hot dogs did I eat over the last 7 days
SELECT HotDogCount, DateKey, SUM(HotDogCount) OVER (ORDER BY DateKey ROWS 6 PRECEDING) AS HotDogsLast7Days FROM dbo.HotDogConsumption
Results:
+-------------+------------+------------------+ | HotDogCount | DateKey | HotDogsLast7Days | +-------------+------------+------------------+ | 3 | 09/21/2020 | 3 | | 2 | 9/22/2020 | 5 | | 1 | 09/23/2020 | 6 | | 1 | 09/24/2020 | 7 | | 1 | 09/25/2020 | 8 | | 4 | 09/26/2020 | 12 | | 1 | 09/27/2020 | 13 | | 3 | 09/28/2020 | 13 | | 2 | 09/29/2020 | 13 | | 1 | 09/30/2020 | 13 | +-------------+------------+------------------+
Now, the problem I am having is when there are gaps in the dates. So, basically, one day my intestines and circulatory system are screaming at me: “What the heck are you doing, you’re going to kill us all!!!” So, I decide to give my body a break for a day and now there is no record for that day. When I use the “ROWS 6 PRECEDING” method, I will now be reaching back 8 days, rather than 7, because one day was missed.
So, the question is, do any of you know how I could use the OVER clause to truly use a date value (something like “DATEADD(day,-7,DateKey)”) to determine how many previous rows should be summed up for a true 7 day rolling sum, regardless of whether I only ate hot dogs on one day or on all 7 days?
Side note, to have a record of 0 for the days I didn’t eat any hotdogs is not an option. I understand that I could use an array of dates and left join to it and do a
CASE WHEN Datekey IS NULL THEN 0 END
type of deal, but I would like to find out if there is a different way where the rows preceding value can somehow be determined dynamically based on the date.
Advertisement
Answer
Window functions are the right approach in theory. But to look back at the 7 preceding days (not rows), we need a range
frame specification – which, unfornately, SQL Server does not support.
I am going to recommend a subquery, or a lateral join:
select hdc.*, hdc1.* from dbo.HotDogConsumption hdc cross apply ( select coalesce(sum(HotDogCount), 0) HotDogsLast7Days from dbo.HotDogConsumption hdc1 where hdc1.datekey >= dateadd(day, -7, hdc.datekey) and hdc1.datekey < hdc.datekey ) hdc1
You might want to adjust the conditions in the where
clause of the subquery to the precise frame that you want. The above code computes over the last 7 days, not including today. Something equivalent to your current attempt would be like:
where hdc1.datekey >= dateadd(day, -6, hdc.datekey) and hdc1.datekey <= hdc.datekey