I have a PostgreSQL table with roughly the following shape:
sensor_id | recorded_at
----------+--------------------
A         | 2020-01-01 00:00:00
A         | 2020-01-01 00:03:00
B         | 2020-01-01 01:00:00
C         | 2020-01-01 01:03:00
...       | ...
Given a
- Start time
- End time
- Bucket width
- List of sensor ids
I’d like to write a query that slices the range [start_time, end_time] into sub-intervals (buckets) of width bucket_width and, for each bucket, checks whether each of the sensors in the list had a record with recorded_at during that bucket.
So for example, assuming inputs of
- Start time = '2020-01-01 00:00:00'
- End time = '2020-01-01 02:00:00'
- Bucket width = '1 hour'
- List of sensor ids = ['A', 'B', 'C']
the query should return something like
t1                  | t2                  | A_count | B_count | C_count
--------------------+---------------------+---------+---------+--------
2020-01-01 00:00:00 | 2020-01-01 01:00:00 |       2 |       0 |       0
2020-01-01 01:00:00 | 2020-01-01 02:00:00 |       0 |       1 |       1
I don’t need the actual counts, so I imagine LIMIT 1 will show up somewhere. I included them in the example only to make clear what I’m looking for.
Answer
One option uses generate_series() to generate one row per bucket, then a left join to bring in the table, and finally conditional aggregation to get the count by sensor:
    select
        s.ts ts1,
        s.ts + interval '1 hour' ts2,
        count(*) filter (where sensor_id = 'A') a_count,
        count(*) filter (where sensor_id = 'B') b_count,
        count(*) filter (where sensor_id = 'C') c_count
    from generate_series(
        '2020-01-01 00:00:00'::timestamp,
        '2020-01-01 02:00:00'::timestamp,
        '1 hour'
    ) s(ts)
    left join mytable t
        on  t.recorded_at >= s.ts
        and t.recorded_at <  s.ts + interval '1 hour'
    group by s.ts
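Since the question only needs to know whether a record exists, not how many, the same join can be sketched with boolean flags instead of counts. This is only a sketch: the `has_*` column names are made up for illustration, `mytable` is the table name used in the query above, and `bool_or()` is swapped in for `count(*) filter (...)`.

```sql
-- Sketch: same buckets and join as the count query, but one true/false per sensor.
-- The has_* column names are illustrative, not from the original post.
select
    s.ts                     as t1,
    s.ts + interval '1 hour' as t2,
    -- bool_or() returns NULL when a bucket has no joined rows, so default to false
    coalesce(bool_or(t.sensor_id = 'A'), false) as has_a,
    coalesce(bool_or(t.sensor_id = 'B'), false) as has_b,
    coalesce(bool_or(t.sensor_id = 'C'), false) as has_c
from generate_series(
         '2020-01-01 00:00:00'::timestamp,
         '2020-01-01 02:00:00'::timestamp,
         interval '1 hour'
     ) s(ts)
left join mytable t
       on t.recorded_at >= s.ts
      and t.recorded_at <  s.ts + interval '1 hour'
group by s.ts
order by s.ts;
```

An equivalent spelling is `count(*) filter (where sensor_id = 'A') > 0`, which stays closer to the original query and needs no `coalesce()`.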
Note that this would generate an extra record (from 02:00:00 to 03:00:00) as compared to your desired results, because generate_series() includes the stop value when it lands exactly on a step. If you want to avoid that, you can slightly modify the generate_series() arguments, like:
    ...
    from generate_series(
        '2020-01-01 00:00:00'::timestamp,
        '2020-01-01 02:00:00'::timestamp - interval '1 second',
        '1 hour'
    ) s(ts)
    ...
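If the list of sensor ids has to be a real input rather than hard-coded columns, one possible variation returns one row per (bucket, sensor) pair instead of one column per sensor, since plain SQL cannot produce a variable number of columns. The sketch below is an assumption-laden illustration: the `params` CTE, its column names, and the `has_record` column are hypothetical, and it assumes the range is a whole multiple of the bucket width (a trailing partial bucket would be dropped). `exists` stops at the first matching row, which is roughly the LIMIT 1 idea from the question.

```sql
-- Sketch: all four inputs live in one place; names in "params" are illustrative.
with params as (
    select timestamp '2020-01-01 00:00:00' as start_time,
           timestamp '2020-01-01 02:00:00' as end_time,
           interval '1 hour'               as bucket_width,
           array['A', 'B', 'C']            as sensor_ids
)
select
    b.ts                  as t1,
    b.ts + p.bucket_width as t2,
    ids.sensor_id,
    -- true as soon as one matching row is found in the bucket
    exists (
        select 1
        from mytable t
        where t.sensor_id   =  ids.sensor_id
          and t.recorded_at >= b.ts
          and t.recorded_at <  b.ts + p.bucket_width
    ) as has_record
from params p
-- the last bucket starts one bucket_width before end_time, so no extra row appears
cross join lateral generate_series(
    p.start_time,
    p.end_time - p.bucket_width,
    p.bucket_width
) as b(ts)
cross join lateral unnest(p.sensor_ids) as ids(sensor_id)
order by b.ts, ids.sensor_id;
```

With only the four sample rows shown in the question, this should return six rows: A is true for the 00:00 bucket, B and C are true for the 01:00 bucket, and the other three combinations are false.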