Skip to content
Advertisement

How can I calculate the median of all rows?

What I’m trying to accomplish: I’d like to calculate the median number of rows for the last 49 days for each store and hour of the day (0 – 24 hours).

What I’ve done so far: My current SQL is below. I’m using the median function incorrectly and am receiving the following error: not a single-group group function.

SELECT 
store,
EXTRACT(HOUR FROM CAST(date AS TIMESTAMP)) AS hour_of_day,
median(count(*))

FROM table

WHERE date >= trunc(sysdate - 49) 

GROUP BY 
store,
EXTRACT(HOUR FROM CAST(date AS TIMESTAMP))

What I’d like to know: Can the median function be used in this manner? If so, is there something I have to change in the group by of the query? If not, can you recommend another way to solve this issue?

Advertisement

Answer

I don’t know if you can do this with window functions. The relevant analytic functions, such as median do not work cumulatively.

But you can use a lateral join:

with dh as (
      select trunc(date) as dte, extract(hour from date) as hh,
             shop, count(*) as cnt
      from t
     )
select dh.*, m.median_cnt
from dh cross join lateral
     (select median(cnt) as median_cnt
      from dh dh2
      where dh2.hh = dh.hh and
            dh2.dte >= dh.dte - interval '48' day and
            dh2.dte <= dte
     ) m;

Note: It is not clear what you mean by 49 days. I assume you want exactly 7 weeks which is 48 days in the past plus the current date on the row.

Also, this ignores 0 values. If you need to take those into account, then ask a new question.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement