I’ve designed a table to keep track of running processes:
Table "public.processes" Column | Type | Collation | Nullable | Default -------------------------+--------------------------+-----------+----------+--------- id | uuid | | not null | duration | bigint | | | pauses | bigint | | | start_date | timestamp with time zone | | | end_date | timestamp with time zone | | | power_levels | integer[] | | |
Where pauses
is the number of times the process has haulted and power_levels
is an array of integers from 0 to 4, with repetitions allowed, representing power levels that that process has been consuming.
Using a single query I would like to select all processes executed within a span of a certain week and display:
- A breakdown of the average duration, per day of week
- The maximum number of pauses within a single process, throughout the week
- The most commonly used power level used throughout the week
So given example data:
start_date | end_date | power_levels | duration | pauses ------------------------+------------------------+--------------+----------+------- 2020-06-06 10:00:00+00 | 2020-06-06 10:10:00+00 | {3} | 1000 | 3 2020-06-07 10:00:00+00 | 2020-06-07 10:10:00+00 | {2} | 2000 | 10 2020-06-07 12:00:00+00 | 2020-06-07 12:10:00+00 | {4,1} | 3000 | 60 2020-06-08 10:00:00+00 | 2020-06-08 10:10:00+00 | {4,2} | 4000 | 10 2020-06-08 12:00:00+00 | 2020-06-08 12:10:00+00 | {4,4,3} | 1337 | 2
I would like a result similar to:
most_used_power_level | avg | max | dow -----------------------+--------+-----+----- 4 | 2500 | 60 | 0 4 | 2668,5 | 60 | 1 4 | 1000 | 60 | 6
So far I have:
select mode() within group (order by most_used_power), avg(duration), max(pauses), extract (dow from start_date) as dow from ( select unnest(power_levels) as most_used_power, duration, pauses, start_date from processes where start_date >= '2020-06-01' and start_date < '2020-06-09' ) as foo group by dow;
But there are two issues with this query:
- Unnesting the power levels in the sub-query skew the calculation of the averages
- The final result groups the most used power level, and number of pauses per day of week, rather than the whole duration
Apart from splitting the query in 2, I’m not sure how to proceed from here. Is there a way to achieve this in a single query?
Advertisement
Answer
If you want the most used power level independent of the day of the week, then calculate it separately:
select avg(p.duration), max(p.pauses), extract(dow from start_date) as dow, pl.most_Used_power_level from processes p cross join (select mode() within group (order by power_level) as most_Used_power_level from processes p cross join lateral unnest(power_levels) power_level where p.start_date >= '2020-06-01' and p.start_date < '2020-06-09' ) pl where p.start_date >= '2020-06-01' and p.start_date < '2020-06-09' group by dow, pl.most_Used_power_level;