Skip to content
Advertisement

SQL Function to Look at near-by Intervals of data

I am trying to determine if there is some sort of SQL function that would allow me to look at nearby intervals of data to aggregate them into other aggregate functions.

My current data looks something like the following and occurs every 5 minutes with a single entry per day for a given time period per entity.

  • EntityA 20200201 00:10:00 21.0
  • EntityA 20200201 00:15:00 23.0
  • EntityA 20200201 00:20:00 23.0
  • EntityA 20200202 00:15:00 22.1
  • EntityA 20200203 00:15:00 13.3

I have several aggregate functions looking at the entire series of data for the time frame already and I feed it a parameter to tell it what time of day I want to look at as simply “00:15” as a parameter (declared as datetime) which via the join Facts.Timekey = 4 which is “00:15” as its the 4th 5 minute interval of the day.

–Abbreviated and simple function assuming only a single entity.

DECLARE @TimeFrame DateTime = '00:15:00'
(SELECT stdevp(convert(decimal(19,6), facts.ActualValue))
                                   FROM FactsDTValue as Facts
                                   LEFT JOIN DimTime DT on Facts.TimeKey = DT.TimeKey
                                   Where DT.TimeName24 = @TimeFrame 
                                   Group by Fact.Entity);

What I wish to do for function above is simple function that can be applied on a larger scale to be a range of timeframes around 00:15.

So effectively if the parameter is pass 00:15 and say I had another parameter set to say 3 it would get the values of facts.ActualValue for each DAY in the series.. not just the single value to perform the aggregate function like STDDEVP, AVERAGE, MAX, MIN, Median, Mode, etc.

  1. 00:00 (-3)
  2. 00:05 (-2)
  3. 00:10 (-1)
  4. 00:15 (0)
  5. 00:20 (+1)
  6. 00:25 (+2)
  7. 00:30 (+3)

So effectively for each day sampling 7 values (when parameter is set to 3) vs. just 1.

If it helps at all is I needing to do this to handle edge values but also NULLs as if there is no data received for an entity the period simply doesn’t exist.

I am using MS SQL Server 2016; What is the best way to approach this and is there a function that might do this automatically or am I possibly just missing something really simple?

Advertisement

Answer

You could do this with window functions. The logic is to enumerate the rows for each entity and for each day with row_number(), then identify the row number of the row in each partition whose time matches the parameter. You can then use that information for filtering.

Assuming that the columns of your table are entity (the entity), ts (the datetime column) and val (the value), that would be:

select entity, cast(ts as day) ts_day, stdevp(val) val_stdevp
from (
    select 
        t.*,
        max(case when cast(ts as time) = @TimeFrame then rn end)
            over(partition by entity, cast(ts as date)) rn0
    from (
        select 
            t.*,
            row_number() over(partition by entity, cast(ts as date) order by ts) rn
        from mytbale t
    ) t
) t
where rn between rn0 - 3 and rn0 + 3
group by entity, cast(ts as day)

This gives you one record per entity and per day, that holds the stdevp() of the value for the 7 target records.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement