I am looking to identify the average life time in days for open bugs based on severity.
bug | severity | status | date_assigned |
---|---|---|---|
1 | A | open | 2021-9-14 |
1 | A | in progress | 2021-9-15 |
1 | A | fixed | 2021-9-16 |
1 | A | verified | 2021-9-17 |
1 | A | closed | 2021-9-18 |
2 | B | opened | 2021-10-18 |
2 | B | in progress | 2021-10-19 |
2 | B | closed with fix | 2021-10-20 |
3 | C | open | 2021-11-20 |
3 | C | review | 2021-11-21 |
3 | C | close | 2021-11-22 |
4 | A | open | 2021-12-24 |
4 | A | closed | 2021-12-26 |
I would like to track no of days it takes for each of the severity (A, B, C, D) since their first status like (open/opened) till their status changes to closed (close/ closed/ closed with fix)
This is what I tried so far,
select severity, count(distinct date_assigned) as no_of_days from Table 1 as a join Table 1 as b where status = 'open' | 'opened' and status = 'close' | 'closed' | 'closed with fix'
But this isn’t giving me the expected output
Advertisement
Answer
Assuming that there is one “open” and “close” per bug, then you can use two levels of aggregation:
select severity, avg( close_date - open_date ) as avg_days from (select bug, severity, min(date_assigned) as open_date, max(date_assigned) as close_date from table1 where status like '%open%' or status like '%closed%' group by bug, severity ) t1 group by severity order by severity;
Note that date functions vary considerably among databases. The exact syntax for the data difference may differ from above.
EDIT:
In Databricks, I think you want:
select severity, avg( datediff(close_date, open_date) ) as avg_days