I have a Postgres table with roughly the following shape:
sensor_id |         recorded_at
----------+--------------------
        A | 2020-01-01 00:00:00
        A | 2020-01-01 00:03:00
        B | 2020-01-01 01:00:00
        C | 2020-01-01 01:03:00
      ... |                 ...
Given a
- Start time
- End time
- Bucket width
- List of sensor ids
I’d like to write a query that slices the range [start_time, end_time] into sub-intervals (buckets) of width bucket_width and, for each bucket, checks whether each of the sensors in the list has a record with recorded_at falling in that bucket.
So for example, assuming inputs of
- Start time = '2020-01-01 00:00:00'
- End time = '2020-01-01 02:00:00'
- Bucket width = '1 hour'
- List of sensor ids = ['A', 'B', 'C']
the query should return something like
                 t1 |                  t2 | A_count | B_count | C_count
--------------------+---------------------+---------+---------+--------
2020-01-01 00:00:00 | 2020-01-01 01:00:00 |       2 |       0 |       0
2020-01-01 01:00:00 | 2020-01-01 02:00:00 |       0 |       1 |       1
I don’t need the actual counts, only whether each sensor reported at all, so I imagine LIMIT 1 will show up somewhere. I included the counts in the example only to make clear what I’m looking for.
Answer
One option uses generate_series() to generate one row per bucket, then a left join to bring in the table, and finally conditional aggregation to get the count per sensor:
select s.ts as ts1, s.ts + interval '1 hour' as ts2,
    count(*) filter (where t.sensor_id = 'A') as a_count,
    count(*) filter (where t.sensor_id = 'B') as b_count,
    count(*) filter (where t.sensor_id = 'C') as c_count
from generate_series('2020-01-01 00:00:00'::timestamp, '2020-01-01 02:00:00'::timestamp, '1 hour') s(ts)
left join mytable t on t.recorded_at >= s.ts and t.recorded_at < s.ts + interval '1 hour'
group by s.ts
order by s.ts
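Since you only need to know whether each sensor reported, not how often, the counts can be swapped for booleans. The following is a minimal sketch of that variant rather than a tuned solution: it keeps the same generate_series() call and the same table name (mytable), and the column aliases (t1, t2, a_present, ...) are made up for illustration. It also restricts the join to the sensors in the list, which is an addition of mine.

```sql
-- Presence-only variant: one boolean per sensor instead of a count.
-- Aliases such as a_present are illustrative, not part of the original query.
select
    s.ts as t1,
    s.ts + interval '1 hour' as t2,
    -- bool_or() returns NULL for a bucket with no joined rows, hence coalesce()
    coalesce(bool_or(t.sensor_id = 'A'), false) as a_present,
    coalesce(bool_or(t.sensor_id = 'B'), false) as b_present,
    coalesce(bool_or(t.sensor_id = 'C'), false) as c_present
from generate_series(
        '2020-01-01 00:00:00'::timestamp,
        '2020-01-01 02:00:00'::timestamp,
        interval '1 hour'
     ) s(ts)
left join mytable t
       on t.recorded_at >= s.ts
      and t.recorded_at <  s.ts + interval '1 hour'
      and t.sensor_id in ('A', 'B', 'C')  -- ignore sensors outside the list
group by s.ts
order by s.ts;
```

This still scans every matching row in each bucket. If the table is large, a correlated exists (...) subquery per sensor may be closer to the LIMIT 1 idea, since it can stop at the first match.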
Note that generate_series() includes its stop value, so this would generate an extra record (from 02:00:00 to 03:00:00) as compared to your desired results. If you want to avoid that, you can slightly modify the generate_series() arguments, like:
...
    from generate_series(
        '2020-01-01 00:00:00'::timestamp, 
        '2020-01-01 02:00:00'::timestamp - interval '1 second', 
        '1 hour'
     ) s(ts)
...
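If the start time, end time and bucket width come in as inputs, one way to avoid repeating the literals is to put them in a CTE. This is a sketch under my own assumptions: the CTE name params and its column names are hypothetical, and the stop value is end_time minus one bucket width instead of minus one second.

```sql
-- Inputs collected in one place; "params" and its column names are hypothetical.
with params as (
    select
        '2020-01-01 00:00:00'::timestamp as start_time,
        '2020-01-01 02:00:00'::timestamp as end_time,
        interval '1 hour'                as bucket_width
)
select
    s.ts as t1,
    s.ts + p.bucket_width as t2,
    count(*) filter (where t.sensor_id = 'A') as a_count,
    count(*) filter (where t.sensor_id = 'B') as b_count,
    count(*) filter (where t.sensor_id = 'C') as c_count
from params p
-- The last bucket starts at or before end_time - bucket_width,
-- so no bucket extends past end_time.
cross join lateral generate_series(
        p.start_time,
        p.end_time - p.bucket_width,
        p.bucket_width
     ) as s(ts)
left join mytable t
       on t.recorded_at >= s.ts
      and t.recorded_at <  s.ts + p.bucket_width
group by s.ts, p.bucket_width
order by s.ts;
```

The two stop values behave differently when the range is not a whole multiple of the bucket width. Subtracting one bucket width drops the trailing partial bucket, while subtracting one second keeps a final bucket that runs past end_time.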