[Query and question edited and fixed thanks to comments from @Gordon Linoff and @shawnt00]
I recently inherited a SQL query that calculates the number of some events in time windows of 30 days from a log database. It uses a CTE (Common Table Expression) to generate the 30 days ranges since ‘2019-01-01’ to now. And then it counts the cases in those 30/60/90 days intervals. I am not sure this is the best method. All I know is that it takes a long time to run and I do not understand 100% how exactly it works. So I am trying to rebuild it in an efficient way (maybe as it is now is the most efficient way, I do not know).
I have several questions:
- One of the things I notice is that instead of using DATEDIFF the query simply substracts a number of days from the date.Is that a good practice at all?
- Is there a better way of doing the time comparisons?
- Is there a better way to do the whole thing? The bottom line is: I need to aggregate data by number of occurrences in time periods of 30, 60 and 90 days.
Note: LogDate original format is like 2019-04-01 18:30:12.000.
DECLARE @dt1 Datetime='2019-01-01' DECLARE @dt2 Datetime=getDate(); WITH ctedaterange AS (SELECT [Dates]=@dt1 UNION ALL SELECT [dates] + 30 FROM ctedaterange WHERE [dates] + 30<= @dt2) SELECT [dates], lt.Activity, COUNT(*) as Total, SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 90 THEN 1 ELSE 0 END) AS Activity90days, SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 60 THEN 1 ELSE 0 END) AS Activity60days, SUM(CASE WHEN lt.LogDate <= dates and lt.LogDate > dates - 30 THEN 1 ELSE 0 END) AS Activity30days FROM ctedaterange AS cte JOIN (SELECT Activity, CONVERT(DATE, LogDate) as LogDate FROM LogTable) AS lt ON cte.[dates] = lt.LogDate group by [dates], lt.Activity OPTION (maxrecursion 0)
Sample dataset (LogTable):
LogDate, Activity 2020-02-25 01:10:10.000, Activity01 2020-04-14 01:12:10.000, Activity02 2020-08-18 02:03:53.000, Activity02 2019-10-29 12:25:55.000, Activity01 2019-12-24 18:11:11.000, Activity03 2019-04-02 03:33:09.000, Activity01
Expected Output (the output does not reflect the data shown above for I would need too many lines in the sample set to be shown in this post)
As I said above, the bottom line is: I need to aggregate data by number of occurrences in time periods of 30, 60 and 90 days.
Activity, Activity90days, Activity60days, Activity30days Activity01, 3, 0, 1 Activity02, 1, 10, 2 Activity03, 5, 1, 3
Thank you for any suggestion.
Advertisement
Answer
SQL Server doesn’t yet have the option to range over values of the window frame of an analytic function. Since you’ve generated all possible dates though and you’ve already got the counts by date, it’s very easy to look back a specific number of (aggregated) rows to get the right totals. Here is my suggested expression for 90 days:
sum(count(LogDate)) over ( partition by Activity order by [dates] with rows between 89 preceding and current row )