Skip to content
Advertisement

SQL Server filling gaps in time series

I’m having trouble with a request in SQL.

I have two tables, one representing a vector of dates and another timeseries of prices for different securities:

Dates:

DateId Date
1 2021-01-01
2 2021-01-02
3 2021-01-03

TimeSerie:

SecurityId DateId Value
1 1 0.25
1 3 0.32
2 1 0.41
2 2 0.67

The timeserie may have gaps, for example in the above table the SecurityId=1 doesn’t have any row at DateId=2 and SecurityId=2 doesn’t have any rows at DateId=3.

I need to build a request that fill the gaps with the last previous value in the timeserie. Resulting in having a value for all securities and all dates as follow:

SecurityId DateId Value
1 1 0.25
1 2 0.25
1 3 0.32
2 1 0.41
2 2 0.67
2 3 0.67

I tried to start matching all dates with a right join

 SELECT  * from [TimeSerie] px RIGHT JOIN Dates dt on dt.DateId = px.Dateid

Unfortunately this doesn’t work as there is always a security for which the equality on DateId matches, so I don’t get the TimeSeries rows having no values.

I’m working on SQL Server standard 2019 and I’am aiming to a solution based on a single query (avoiding procedures with temp tables or cursors).

Advertisement

Answer

One method would be to use CTEs and a windowed COUNT to put the data into groups and then get the MAX value of that group:

--Sample data
WITH Dates AS(
    SELECT *
    FROM (VALUES(1,CONVERT(date,'2021-01-01')),
                (2,CONVERT(date,'2021-01-02')),
                (3,CONVERT(date,'2021-01-03')))D(DateID,[Date])),
TimeSerie AS(
    SELECT *
    FROM (VALUES(1,1,0.25),
                (1,3,0.32),
                (2,1,0.41),
                (2,2,0.67))V(SecurityID,DateID,[Value])),
--Solution
DateSeries AS(
    SELECT DISTINCT
           D.DateID,
           D.[Date],
           TS.SecurityID
    FROM Dates D
         CROSS JOIN TimeSerie TS),
Groups AS(
    SELECT DS.SecurityID,
           DS.DateID,
           TS.[value],
           COUNT(TS.[Value]) OVER (PARTITION BY DS.SecurityID ORDER BY [Date]) AS Grp
    FROM DateSeries DS
         LEFT JOIN TimeSerie TS ON DS.SecurityID = TS.SecurityID
                               AND DS.DateID = TS.DateID)
SELECT G.SecurityID,
       G.DateID,
       MAX([Value]) OVER (PARTITION BY G.SecurityID, G.Grp) AS [Value]
FROM Groups G;
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement