SQL rolling average that restarts on gaps

I have values that arrive every hour, and I need an 8-hour rolling average. The catch is that the rolling average has to “restart” whenever there is a gap.

See the table below (my desired output). The value for 14:45 is missing, so the average for 15:45 is just that row’s Scaled.

The 16:45, 17:45, 18:45 and 19:45 values are also missing, so the average for 20:45 is that row’s Scaled.

For 21:45, it is the average of 20:45 and 21:45.

For 22:45, it is the average of 20:45, 21:45 and 22:45.

And so on…

StartDate                Scaled  Rolling Average
2021-01-28 00:45:00.000  10.589  10.589
2021-01-28 01:45:00.000  9.989   10.289000000000001
2021-01-28 02:45:00.000  10.512  10.363333333333335
2021-01-28 03:45:00.000  10.22   10.3275
2021-01-28 04:45:00.000  13.23   10.908000000000001
2021-01-28 05:45:00.000  14.516  11.509333333333336
2021-01-28 06:45:00.000  15.687  12.106142857142858
2021-01-28 07:45:00.000  14.316  12.382375000000001
2021-01-28 08:45:00.000  16.888  13.169750000000002
2021-01-28 09:45:00.000  24.58   14.993625000000002
2021-01-28 10:45:00.000  24.349  16.72325
2021-01-28 11:45:00.000  22.832  18.29975
2021-01-28 12:45:00.000  26.166  19.91675
2021-01-28 13:45:00.000  27.437  21.531875
2021-01-28 15:45:00.000  22.424  22.424
2021-01-28 20:45:00.000  19.629  19.629
2021-01-28 21:45:00.000  21.431  20.53
2021-01-28 22:45:00.000  22.07   21.04333333

I need this to go into a view, so I can’t use variables.

I cannot find a way to do it, so any help will be greatly appreciated.

Thanks!

Answer

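You might find that the brute-force approach is simplest: fetch the previous rows with lag() and average only those that sit exactly n hours before the current row: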

select t.*, v.avg_scaled
from (select t.*,
             -- previous values (continue the pattern through lag 7 for an 8-hour window)
             lag(scaled, 1) over (order by startdate) as scaled_1,
             lag(scaled, 2) over (order by startdate) as scaled_2,
             . . .
             -- matching previous timestamps, used to detect gaps
             lag(startdate, 1) over (order by startdate) as startdate_1,
             lag(startdate, 2) over (order by startdate) as startdate_2,
             . . .
      from t
     ) t cross apply
     (select avg(v.scaled) as avg_scaled
      from (values (0, t.scaled, t.startdate),
                   (1, t.scaled_1, t.startdate_1),
                   (2, t.scaled_2, t.startdate_2),
                   . . .
           ) v(n, scaled, startdate)
       -- keep the n-th previous row only if it is exactly n hours back (no gap in between)
       where datediff(hour, v.startdate, t.startdate) = v.n
     ) v;
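
If writing out all the lag() columns feels unwieldy, a different technique (often called "gaps and islands") may also fit in a view. This is only a sketch under assumptions, not the method used above. It assumes SQL Server 2012 or later and the same table t with columns startdate and scaled. It also assumes readings always land on the same minute of the hour, so that datediff(hour, ...) = 1 really means "the previous hour is present". The view name dbo.RollingAverage and the columns is_new_run, run_id and rolling_avg are made up for illustration.

```sql
-- Hypothetical view name; t, startdate and scaled are taken from the query above.
create view dbo.RollingAverage as
with flagged as (
    select startdate, scaled,
           -- 1 when this row starts a new run: first row, or previous row is not exactly 1 hour back
           case when datediff(hour,
                              lag(startdate) over (order by startdate),
                              startdate) = 1
                then 0 else 1 end as is_new_run
    from t
),
runs as (
    select startdate, scaled,
           -- running count of run starts gives every gap-free run its own id
           sum(is_new_run) over (order by startdate
                                 rows between unbounded preceding and current row) as run_id
    from flagged
)
select startdate, scaled,
       -- at most 8 rows (current + 7 preceding), never reaching across a gap
       avg(scaled) over (partition by run_id
                         order by startdate
                         rows between 7 preceding and current row) as rolling_avg
from runs;
```

With the sample data, 15:45 and 20:45 each start a new run, so their averages are just their own Scaled, and 21:45 averages only 20:45 and 21:45 (20.53), as in the desired output.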
User contributions licensed under: CC BY-SA