I have a data table with timestamp and some data columns. The rows are inserted in arbitrary time intervals, i.e. the timestamp difference of consecutive rows is not stable but ranges from several seconds to several hours.
I need to select one row per time interval of constant length, if there is any.
Example
To get the time intervals for selection, I have a function that generates table of time intervals, for example:
start_time end_time '2021-05-31 10:00:00' '2021-05-31 10:10:00' '2021-05-31 10:10:00' '2021-05-31 10:20:00' '2021-05-31 10:20:00' '2021-05-31 10:30:00'
Then for this source data:
timestamp data '2021-05-31 10:01:00' 1 '2021-05-31 10:02:00' 2 '2021-05-31 10:05:00' 3 '2021-05-31 10:21:00' 4
I require result:
timestamp data '2021-05-31 10:01:00' 1 '2021-05-31 10:21:00' 4
- the first row out of three is selected for the first interval
- no row is selected for the second interval
- the first and only possible row is selected for the third interval
Thanks
Advertisement
Answer
You can use a lateral join to pick one row per interval:
select * from get_intervals() i cross join lateral ( select * from my_table t where t.timestamp >= i.start_time and t.timestamp < i.end_time order by t.timestamp -- or by something random? fetch first row only ) d;
(A lateral join is the first thing that comes to mind. To join all rows first and then use DISTINCT ON
instead, could lead to a large intermediate result, so I would prefer the lateral join over that approach.)