i have problem with partitioning the process data i have from our workflow system. data are stored in database step after step (one step is one row) for couple of processes (id) and a date of the event. in simplified data is looks as follows:
id step date
1 a 2019-12-01
1 a 2019-12-02
1 b 2019-12-02
1 x 2019-12-03
1 a 2019-12-04
1 b 2019-12-05
2 a 2019-12-05
2 b 2019-12-06
what i need to have is such short table
id iteration first step-a first step-b
1 1 2019-12-01 2019-12-02
1 2 2019-12-04 2019-12-05
2 1 2019-12-05 2019-12-06
the “x” step is return step. it triggers whole process to start all over from the beginning, so transferring the data into small table#2 i need to split process with id=1 in 2 iterations. there could be multiply a and b steps but you need to choose the first one.
i tried with group by
and partition by
but with no success – i don’t know how to split data given x in the middle. do you have any idea?
Advertisement
Answer
I would approach this with a window sum that defines the iteration. Everytime a x
is met for a given id
, a new iteration starts.
select
id,
iteration,
min(case when step = 'a' then mdate end) first_step_a,
min(case when step = 'b' then mdate end) first_step_b
from (
select
t.*,
1 + sum(case when step = 'x' then 1 else 0 end)
over(partition by id order by mdate) iteration
from mytable t
) t
group by id, iteration
order by id, iteration
ID | ITERATION | FIRST_STEP_A | FIRST_STEP_B -: | --------: | :----------- | :----------- 1 | 1 | 2019-12-01 | 2019-12-02 1 | 2 | 2019-12-04 | 2019-12-05 2 | 1 | 2019-12-05 | 2019-12-06