Skip to content
Advertisement

Interpolate Multiseries Data In SQL

I have a system that stores the data only when they are changed. So, the dataset looks like below.

data_type_id data_value inserted_at
2 240 2022-01-19 17:20:52
1 30 2022-01-19 17:20:47
2 239 2022-01-19 17:20:42
1 29 2022-01-19 17:20:42

My data frequency is every 5 seconds. So, whether there’s any timestamp or not I need to get the result by assuming in this 5th-second data value the same as the previous value.

As I am storing the data that are only changed, indeed the dataset should be like below.

data_type_id data_value inserted_at
2 240 2022-01-19 17:20:52
1 30 2022-01-19 17:20:52
2 239 2022-01-19 17:20:47
1 30 2022-01-19 17:20:47
2 239 2022-01-19 17:20:42
1 29 2022-01-19 17:20:42

I don’t want to insert into my table, I just want to retrieve the data like this on the SELECT statement.

Is there any way I can create this query?

PS. I have many data_types hence when the OP makes a query, it usually gets around a million rows.

EDIT: Information about server Server version: 10.3.27-MariaDB-0+deb10u1 Debian 10

The User is going to determine the SELECT DateTime. So, there’s no certain between time.

As @Akina mentioned, sometimes there’re some gaps between the inserted_at. The difference might be ~4seconds or ~6seconds instead of a certain 5seconds. Since it’s not going to happen so frequently, It is okay to generate by ignoring this fact.

Advertisement

Answer

With the help of a query that gets you all the combinations of data_type_id and the 5-second moments you need, you can achieve the result you need using a subquery that gets you the closest data_value:

Fiddle

You can replace the recursive CTE with any query that gets you all the 5-second moments you need.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement