Skip to content
Advertisement

process/transaction data partitioning

i have problem with partitioning the process data i have from our workflow system. data are stored in database step after step (one step is one row) for couple of processes (id) and a date of the event. in simplified data is looks as follows:

id     step    date
1       a       2019-12-01
1       a       2019-12-02
1       b       2019-12-02
1       x       2019-12-03
1       a       2019-12-04
1       b       2019-12-05
2       a       2019-12-05
2       b       2019-12-06

what i need to have is such short table

id  iteration   first step-a       first step-b
1    1           2019-12-01         2019-12-02
1    2           2019-12-04         2019-12-05
2    1           2019-12-05         2019-12-06

the “x” step is return step. it triggers whole process to start all over from the beginning, so transferring the data into small table#2 i need to split process with id=1 in 2 iterations. there could be multiply a and b steps but you need to choose the first one.

i tried with group by and partition by but with no success – i don’t know how to split data given x in the middle. do you have any idea?

Advertisement

Answer

I would approach this with a window sum that defines the iteration. Everytime a x is met for a given id, a new iteration starts.

select
    id,
    iteration,
    min(case when step = 'a' then mdate end) first_step_a,
    min(case when step = 'b' then mdate end) first_step_b
from (
    select 
        t.*,
        1 + sum(case when step = 'x' then 1 else 0 end) 
            over(partition by id order by mdate) iteration
    from mytable t
) t
group by id, iteration
order by id, iteration

Demo on DB Fiddle:

ID | ITERATION | FIRST_STEP_A | FIRST_STEP_B
-: | --------: | :----------- | :-----------
 1 |         1 | 2019-12-01   | 2019-12-02  
 1 |         2 | 2019-12-04   | 2019-12-05  
 2 |         1 | 2019-12-05   | 2019-12-06  
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement