Skip to content
Advertisement

How to calculate rolling timestamp sum from table partitioned by specific column? – SQL

I have a table with a series of timelines that are normalized starting from 00:00:00.00000. I want to summate them sequentially and stitch them together based on my order_key value.

Sample Data:

id    order_key  start_time         end_time           activity_type
1     1          00:00:00.00000     00:00:00.01000     A
1     1          00:00:00.01000     00:00:00.02000     B
1     1          00:00:00.02000     00:00:00.03000     C
1     1          00:00:00.03000     00:00:00.03500     A
1     2          00:00:00.00000     00:00:00.01500     A
1     2          00:00:00.01500     00:00:00.04500     B
1     3          00:00:00.00000     00:00:00.05500     B

Desired Output:

id  start_time         end_time           activity_type
1   00:00:00.00000     00:00:00.01000     A
1   00:00:00.01000     00:00:00.02000     B
1   00:00:00.02000     00:00:00.03000     C
1   00:00:00.03000     00:00:00.03500     A
1   00:00:00.03500     00:00:00.05000     A
1   00:00:00.05000     00:00:00.08000     B
1   00:00:00.08000     00:00:00.13500     B

My Attempt:

SELECT 
   id
 , -- CASE WHEN new order_key THEN LAG(end_time) OVER (PARTITION BY id ORDER BY snap_view_index, start_time) ELSE start_time END AS start_time
 , -- CASE WHEN new order_key THEN TIME_ADD(LAG(end_time), INTERVAL TIME_DIFF(end_time, start_time, MILLISECOND) MILLISECOND) ELSE TIME_ADD(start_time, INTERVAL TIME_DIFF(end_time, start_time, MILLISECOND) MILLISECOND)
, activity_type
FROM my_table; 

Advertisement

Answer

Consider below query:

Recursive Approach
WITH RECURSIVE t AS (
  SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY order_key, start_time) pos, * 
    FROM sample_table
),
r AS (
  SELECT * FROM t WHERE pos = 1
   UNION ALL
  SELECT t.* REPLACE (
           r.end_time AS start_time, 
           TIME_ADD(r.end_time, INTERVAL TIME_DIFF(t.end_time, t.start_time, MILLISECOND) MILLISECOND) AS end_time
         )
    FROM r JOIN t ON t.pos = r.pos + 1
)
SELECT * EXCEPT(pos, order_key) FROM r ORDER BY pos;
Non-recursive Approach
WITH diff_by_key AS (
  SELECT id, order_key,
         SUM(TIME_DIFF(MAX(end_time), MIN(start_time), MILLISECOND)) OVER w AS cum_diff
    FROM sample_table GROUP BY id, order_key
  WINDOW w AS (PARTITION BY id ORDER BY order_key)
)
SELECT s.*
       REPLACE(
         TIME_ADD(start_time, INTERVAL IFNULL(cum_diff, 0) MILLISECOND) AS start_time,
         TIME_ADD(end_time, INTERVAL IFNULL(cum_diff, 0) MILLISECOND) AS end_time
       )
  FROM sample_table s LEFT JOIN diff_by_key d ON s.id = d.id AND s.order_key = d.order_key + 1
 ORDER BY order_key, start_time;

enter image description here

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement