Skip to content
Advertisement

Disregard duplicate rows only when these rows are contiguous

I am trying to show only the first occurrence of an activity if this activity occurs multiple times in a row.

Thus, in the scenario below, if activity B occurs 3 times in a row, even if it is at different times, I want to only select the first occurrence. If it occurs again AFTER another activity I also want to keep that.

I want to disregard rows 3 & 4

ROW CASEKEY ACTIVITY EVENTTIME
1 1 A 01.01.2021 12:00:00
2 1 B 01.01.2021 12:05:00
3 1 B 01.01.2021 12:06:00
4 1 B 02.01.2021 13:00:00
5 1 C 02.01.2021 13:10:00
6 1 B 03.01.2021 12:00:00

Advertisement

Answer

Use lag():

select t.*
from (select t.*,
             lag(activity) over (partition by casekey order by eventtime) as prev_activity
      from t
     ) t
where prev_activity is null or prev_activity <> activity;
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement