Skip to content
Advertisement

What is wrong with this running total (cumulative column) in Oracle select with a window function?

I have a query containing 0 or 1 in a column. For demonstration purposes I replaced the 1 with 77 so its more easy to see:

select dates.d the_date
, case TO_CHAR(dates.d, 'd') when '7' then 0 when '1' then 0 
    else 77 
  end as is_workday
from (SELECT (to_date('01.01.2019','dd.mm.yyyy') + (LEVEL -1)) 
      AS d FROM DUAL connect by level <=( 
      to_date('31.12.2020','dd.mm.yyyy')-(to_date('01.01.2019','dd.mm.yyyy')))
     ) as dates

The result is simply a consecutive date column and a marker if this date is a workday or not. (in real life I do a holiday calculation as well, but that’s not the problem):

| THE_DATE            | IS_WORKDAY |
| 2019-01-01 00:00:00 | 77         |
| 2019-01-02 00:00:00 | 77         |
| 2019-01-03 00:00:00 | 77         |
| 2019-01-04 00:00:00 | 77         |
| 2019-01-05 00:00:00 |  0         |
| 2019-01-06 00:00:00 |  0         |
| 2019-01-07 00:00:00 | 77         |
| 2019-01-08 00:00:00 | 77         |
| 2019-01-09 00:00:00 | 77         |
| 2019-01-10 00:00:00 | 77         |
....

I want to add a running total over is_workday, meaning a cumulative value. I am sure Oracles window functions are made for this.

SELECT x.the_date
, x.is_workday
, sum(x.is_workday) over (
    partition by x.the_date          -- define the window
    order by x.the_date asc          -- order inside window
    rows between unbounded preceding -- sum to top
    and current row                  -- sum ending here
  ) as workdays_cumul
FROM (
  select dates.d the_date
  , case TO_CHAR(dates.d, 'd') when '7' then 0 when '1' then 0 
      else 77 
    end as is_workday
  from (SELECT (to_date('01.01.2019','dd.mm.yyyy') + (LEVEL -1)) 
        AS d FROM DUAL connect by level <=( 
        to_date('31.12.2020','dd.mm.yyyy')-(to_date('01.01.2019','dd.mm.yyyy')))
       ) as dates
) x
order by x.the_date
;

But I must miss something here, because I do not get a running total, but just the value itself.

| THE_DATE            | IS_WORKDAY | WORKDAYS_CUMUL |
| 2019-01-01 00:00:00 |  77        |  77            |
| 2019-01-02 00:00:00 |  77        |  77            |
| 2019-01-03 00:00:00 |  77        |  77            |
| 2019-01-04 00:00:00 |  77        |  77            |
| 2019-01-05 00:00:00 |   0        |   0            |
| 2019-01-06 00:00:00 |   0        |   0            |
| 2019-01-07 00:00:00 |  77        |  77            |
| 2019-01-08 00:00:00 |  77        |  77            |
....

Obviously it should be:

| THE_DATE            | IS_WORKDAY | WORKDAYS_CUMUL |
| 2019-01-01 00:00:00 |  77        |   77           |
| 2019-01-02 00:00:00 |  77        |  154           |
| 2019-01-03 00:00:00 |  77        |  231           |
...

I thought it would wirk like this:

  • sum(x.is_workday) — do the sum over the 77-values
  • partition by x.the_date — make windows/sections/parts with one row each (in my case)
  • order by x.the_date asc — order those rows by date
  • rows between unbounded preceding — sum between very first row…
  • and current row — … and the current row.

What am I missing here?

Advertisement

Answer

Remove the PARTITION BY clause of your query as the scope you want for the window is the entire query and not each individual date:

SELECT the_date
,      is_workday
,      sum(is_workday) over (
         ORDER BY the_date asc            -- order inside window
         ROWS BETWEEN unbounded preceding -- sum to top
                  AND current row         -- sum ending here
       ) as workdays_cumul
FROM (
  select d the_date
  ,      CASE
         WHEN d - TRUNC( d, 'IW' ) IN ( 5, 6 )
         THEN 0
         ELSE 1 
         END AS is_workday
  FROM (
    SELECT DATE '2019-01-01' + LEVEL -1 AS d
    FROM   DUAL
    connect by level <= DATE '2020-12-31' - DATE '2019-01-01'
  )
)
order by the_date;

You can also use date literals, need to remove the AS in the table alias (and don’t actually need any table aliases) and can use the difference between date and the start of its ISO week (d - TRUNC( d, 'IW' )) as a method of finding Saturday and Sunday that is independent of the NLS_TERRITORY session parameter.

You could also remove the ROWS BETWEEN clause as ROWS BETWEEN unbounded preceding AND current row is the default behaviour. However, if that is the behaviour you require then you may just as well leave it in to demonstrate that that window is your expected result.

Output:

THE_DATE  | IS_WORKDAY | WORKDAYS_CUMUL
:-------- | ---------: | -------------:
01-JAN-19 |          1 |              1
02-JAN-19 |          1 |              2
03-JAN-19 |          1 |              3
04-JAN-19 |          1 |              4
05-JAN-19 |          0 |              4
06-JAN-19 |          0 |              4
07-JAN-19 |          1 |              5
08-JAN-19 |          1 |              6
09-JAN-19 |          1 |              7
...
24-DEC-20 |          1 |            518
25-DEC-20 |          1 |            519
26-DEC-20 |          0 |            519
27-DEC-20 |          0 |            519
28-DEC-20 |          1 |            520
29-DEC-20 |          1 |            521
30-DEC-20 |          1 |            522

db<>fiddle here

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement