SQL Server – Aggregate data by minute over multiple days

Context

I’m using Microsoft SQL Server 2016.

There is a database table “Raw_data” that contains the status of a machine together with its starting time. There are several machines, and each one writes its status to the database multiple times per minute.

To reduce the data volume, I’m trying to aggregate the data into one-minute chunks and save those for further analysis. Due to a capacity constraint, I want to run this aggregation logic every few minutes (e.g. as a scheduled SQL Server Agent job), delete the raw data, and keep only the aggregated data.

To simplify the example, let’s assume “Raw_data” looks something like this:

There are also database tables “Dim_date” and “Dim_time” that look something like this:

The result should look like this:


Attempt

To calculate the duration of each status per minute, I used a CTE with LEAD to fetch the starting date and time of the next status in the database table, then joined with the dimension tables and aggregated the result.
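The LEAD step described above can be sketched as follows. This is not the original query: the sample rows and the column names `machine` and `status` are assumptions for illustration, and SQLite (through Python's `sqlite3` module) stands in for SQL Server so the snippet runs anywhere. The `LEAD(...) OVER (PARTITION BY ... ORDER BY ...)` expression has the same shape in T-SQL.

```python
import sqlite3

# Hypothetical sample rows: (id, machine, status, created_at).
# Only created_at and the id = 3 timings are taken from the question.
rows = [
    (1, "M1", "running", "2020-01-01 23:05:10"),
    (2, "M1", "idle",    "2020-01-01 23:05:40"),
    (3, "M1", "off",     "2020-01-01 23:07:00"),
    (4, "M1", "running", "2020-01-02 00:20:00"),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE raw_data (id INTEGER, machine TEXT, status TEXT, created_at TEXT)"
)
con.executemany("INSERT INTO raw_data VALUES (?, ?, ?, ?)", rows)

# Each status ends where the next status of the same machine starts.
# Window functions need SQLite 3.25 or newer.
intervals = con.execute(
    """
    SELECT id, status, created_at,
           LEAD(created_at) OVER (
               PARTITION BY machine ORDER BY created_at
           ) AS ended_at
    FROM raw_data
    ORDER BY id
    """
).fetchall()

for row in intervals:
    print(row)
```

The latest row per machine has no successor, so its `ended_at` is NULL (`None` in Python); it has to be kept until the next status arrives before it can be aggregated.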

The Problem

If a status lasts past midnight, it isn’t aggregated correctly. For example, the status at id = 3 in “Raw_data” starts at 23:07 and ends at 00:20 the next day. Here, timekey is greater than end_timekey, so the status gets excluded from the resulting table by the filter TIMEKEY < END_TIMEKEY. I haven’t found a way to change the join condition so that it includes such long-lasting states and still returns the expected result.
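The failure mode can be reproduced outside the database. The sketch below assumes the time keys are HHMM-style integers (the real key format of “Dim_time” is not shown, so this is a guess) and uses an arbitrary example date. It shows why a comparison on time of day alone rejects the interval, while a comparison on the full date and time accepts it.

```python
from datetime import datetime

# Status id = 3 from the question: starts 23:07, ends 00:20 the next day.
# The calendar date itself is an arbitrary example value.
start = datetime(2020, 1, 1, 23, 7)
end = datetime(2020, 1, 2, 0, 20)

# Hypothetical HHMM-style time keys (an assumption about Dim_time).
start_timekey = start.hour * 100 + start.minute
end_timekey = end.hour * 100 + end.minute

# Time of day only: the start key is larger than the end key,
# so a filter like TIMEKEY < END_TIMEKEY drops the row.
passes_time_only_filter = start_timekey < end_timekey

# Full timestamps: the date part restores the correct ordering.
passes_full_timestamp_filter = start < end

print(start_timekey, end_timekey, passes_time_only_filter)
print(passes_full_timestamp_filter)
```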

PS: As mentioned above, status updates normally arrive every few seconds. The problem therefore only occurs in edge cases, e.g. when a machine gets turned off.


Solution

Unfortunately I did not receive an answer on how to get the expected result using the date- and time dimension tables. But dnoeth’s approach using a recursive CTE is good, so I went with it:

Answer

This is a use-case for a recursive CTE, increasing created_at by one minute per recursion:
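A minimal sketch of that idea follows. It is not the answer's original query: the table `status_intervals`, its columns, and the sample rows are assumptions, and SQLite (through Python's `sqlite3` module) stands in for SQL Server so the snippet is runnable. In T-SQL the one-minute step would be written with `DATEADD(minute, 1, ...)` instead of SQLite's `datetime(..., '+1 minute')`.

```python
import sqlite3

# Hypothetical intervals (machine, status, started_at, ended_at), e.g. as
# produced by a LEAD step; the last one crosses midnight (23:07 to 00:20).
rows = [
    ("M1", "running", "2020-01-01 23:05:10", "2020-01-01 23:05:40"),
    ("M1", "idle",    "2020-01-01 23:05:40", "2020-01-01 23:07:00"),
    ("M1", "off",     "2020-01-01 23:07:00", "2020-01-02 00:20:00"),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE status_intervals "
    "(machine TEXT, status TEXT, started_at TEXT, ended_at TEXT)"
)
con.executemany("INSERT INTO status_intervals VALUES (?, ?, ?, ?)", rows)

query = """
WITH RECURSIVE per_minute(machine, status, seg_start, ended_at, minute_start) AS (
    -- Anchor: the first, possibly partial, minute of each status.
    SELECT machine, status, started_at, ended_at,
           strftime('%Y-%m-%d %H:%M:00', started_at)
    FROM status_intervals
    UNION ALL
    -- Recursion: step one minute ahead while the status is still active.
    SELECT machine, status,
           datetime(minute_start, '+1 minute'),
           ended_at,
           datetime(minute_start, '+1 minute')
    FROM per_minute
    WHERE datetime(minute_start, '+1 minute') < ended_at
)
SELECT status, minute_start,
       -- Seconds of this status that fall inside the minute bucket.
       SUM(CAST(strftime('%s', MIN(ended_at, datetime(minute_start, '+1 minute')))
                AS INTEGER)
           - CAST(strftime('%s', seg_start) AS INTEGER)) AS seconds
FROM per_minute
GROUP BY machine, status, minute_start
ORDER BY minute_start, status
"""

per_minute = {(status, minute): secs for status, minute, secs in con.execute(query)}
off_rows = [key for key in per_minute if key[0] == "off"]
print(len(per_minute), len(off_rows))
```

Because each step works on full date and time values, the status that crosses midnight is split into 73 one-minute rows like any other. One point to check on SQL Server: recursive CTEs stop after 100 recursion levels by default, so a status lasting longer than that would need an `OPTION (MAXRECURSION n)` hint on the query.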

See fiddle

User contributions licensed under: CC BY-SA