I have a table with the columns:
- date (timestamp)
- num (integer)
Looks like this in CSV:
"date","num"
"2018-02-07 00:00:00","1"
"2018-02-16 00:00:00","1"
"2018-03-02 00:00:00","4"
"2018-04-04 00:00:00","6"
"2018-06-07 00:00:00","1"
I want different queries to figure out the following:
- A: The earliest date on which the running total of num (summed in date order) is >= 1
- B: The earliest date on which the running total of num (summed in date order) is >= 2
In the sample data the output would be A: 2018-02-07 and B: 2018-02-16.
Note that if the first date in the data had a num higher than 1 then A and B would both equal the same date.
Grouping and using MIN(date) would be good enough to satisfy A, but I can’t figure out how to get B to work if there are two days with num = 1 one right after another. Any ideas are appreciated.
Answer
Use a cumulative sum. For a single number:
select t.*
from (select t.*, sum(num) over (order by date) as running
      from t
     ) t
where running >= 1 and running - num < 1
order by date
limit 1;
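To see why the running total gives the right dates, here is a minimal Python sketch that traces it over the sample data from the question. It is only an illustration of the idea, not a replacement for the query; the helper name `first_date_reaching` is hypothetical, and the rows are assumed to be sorted by date already.

```python
from itertools import accumulate

# Sample rows from the question as (date, num), assumed sorted by date.
rows = [
    ("2018-02-07", 1),
    ("2018-02-16", 1),
    ("2018-03-02", 4),
    ("2018-04-04", 6),
    ("2018-06-07", 1),
]


def first_date_reaching(rows, threshold):
    """Return the first date whose running total of num is >= threshold.

    Hypothetical helper for illustration; returns None if the threshold
    is never reached.
    """
    # accumulate() yields the running totals: 1, 2, 6, 12, 13 for the sample.
    running_totals = accumulate(num for _, num in rows)
    for (date, _), running in zip(rows, running_totals):
        if running >= threshold:
            return date
    return None


print(first_date_reaching(rows, 1))  # 2018-02-07
print(first_date_reaching(rows, 2))  # 2018-02-16
```

The running totals for the sample are 1, 2, 6, 12, 13, so threshold 1 is reached on the first row and threshold 2 on the second, matching the expected A and B.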
If you wanted multiple thresholds at the same time:
select min(date) filter (where running >= 1) as date_1,
       min(date) filter (where running >= 2) as date_2
from (select t.*, sum(num) over (order by date) as running
      from t
     ) t;
Or, if you want them on separate rows (note that distinct on is PostgreSQL-specific):
select distinct on (threshold) v.threshold, t.*
from (select t.*, sum(num) over (order by date) as running
      from t
     ) t cross join
     (values (1), (2)) v(threshold)
where running >= threshold and running - num < threshold
order by threshold, date;