I need to count consecutive days in order to define my cohorts. I have a table that looks like:
pat_id admin_date ---------------------------- 1 3/10/2019 1 3/11/2019 1 3/23/2019 1 3/24/2019 1 3/25/2019 2 12/26/2017 2 2/27/2019 2 3/16/2019 2 3/17/2019
I want such as output:
pat_id admin_date consecutive -------------------------------------------- 1 3/10/2019 1 1 3/11/2019 2 1 3/23/2019 1 1 3/24/2019 2 1 3/25/2019 3 2 12/26/2017 1 2 2/27/2019 1 2 3/16/2019 1 2 3/17/2019 2
so that I can use these consecutive days value (per pat_id) to filter for my cohort. I’ve seen few posts that suggested using DateDiff/DateAdd with row_number, such as:
datediff(day, -row_number() over (partition by mrn order by admin_date), admin_date)
but datediff/dateadd functions wouldn’t work on Netezza…
The closest I’ve got so far was:
select row_number() over (partition by mrn order by administration_date) as consecutive
which doesn’t recognize gap between dates and return such an output:
pat_id admin_date consecutive -------------------------------------------- 1 3/10/2019 1 1 3/11/2019 2 1 3/23/2019 3 1 3/24/2019 4 1 3/25/2019 5 2 12/26/2017 1 2 2/27/2019 2 2 3/16/2019 3 2 3/17/2019 4
Does anyone know how to tackle this?
Advertisement
Answer
Use lag()
to see where the groups start and a cumulative sum to define the group. The rest is just row_number()
:
select t.*, row_number() over (partition by pat_id, grp order by admin_date) as consecutive from (select t.*, sum( case when prev_ad = admin_date - interval '1 day' then 0 else 1 end) over (partition by pat_id order by admin_date) as grp from (select t.*, lag(admin_date) over (partition by pat_id order by admin_date) as prev_ad from t ) t )t ;