Let me try to explain what I want to do with my data. Structure of my data is as follows:
I have 3 columns: date, id, and stage. Stages can be different numbers, but I’m interested in showing specific transition of some id from stage -1 to stage 1. Please see the example below.
I have an
idthat was instage-1, then transited to stage1, and went back to stage-1later. Whenever the transition from-1to1is made, I want thatid.Here, I have transition from
-1to3, and that is not what interest me.Here, transition is made, and this
idis what I want to be shown.Here, we have transition from
1to-1, and that is not what I want.
Data:
#1 date id stage
2018-12-31 520000000001354292 -1
2019-09-30 520000000001354292 1
2019-12-31 520000000001354292 -1
#2 2018-12-31 520000000001435675 -1
2019-03-31 520000000001435675 -1
2019-06-30 520000000001435675 3
#3 2018-12-31 520000000003156164 -1
2019-03-31 520000000003156164 -1
2018-12-31 520000000003161014 -1
2019-03-31 520000000003161014 1
#4 2018-12-31 520500000002472437 1
2019-03-31 520500000002472437 -1
2019-06-30 520500000002472437 -1
2019-09-30 520500000002472437 2
My desired output is:
520000000001354292 520000000003156164
I hope that I explained this clear enough.
Also, after this, i want to show transition from either -1 and 1 to stage 3.
Advertisement
Answer
You could use lag():
select distinct id
from (
select t.*, lag(stage) over(partition by id order by date) lag_stage
from mytable t
) t
where lag_stage = -1 and stage = 1
This brings all ids that have at least one transition from -1 to 1.
Also, after this, i want to show transition from either -1 and 1 to stage 3.
You can easily adapt the query to fit that use case as well. Just change the final where clause to:
where lag_stage in (-1, 1) and stage = 3