I am trying to show only the first occurrence of an activity if this activity occurs multiple times in a row.
Thus, in the scenario below, if activity B
occurs 3 times in a row, even if it is at different times, I want to only select the first occurrence. If it occurs again AFTER another activity I also want to keep that.
I want to disregard rows 3 & 4
ROW | CASEKEY | ACTIVITY | EVENTTIME |
---|---|---|---|
1 | 1 | A | 01.01.2021 12:00:00 |
2 | 1 | B | 01.01.2021 12:05:00 |
3 | 1 | B | 01.01.2021 12:06:00 |
4 | 1 | B | 02.01.2021 13:00:00 |
5 | 1 | C | 02.01.2021 13:10:00 |
6 | 1 | B | 03.01.2021 12:00:00 |
Advertisement
Answer
Use lag()
:
select t.* from (select t.*, lag(activity) over (partition by casekey order by eventtime) as prev_activity from t ) t where prev_activity is null or prev_activity <> activity;