Skip to content
Advertisement

How to count consecutive dates using Netezza

I need to count consecutive days in order to define my cohorts. I have a table that looks like:

pat_id         admin_date
----------------------------
1              3/10/2019
1              3/11/2019
1              3/23/2019
1              3/24/2019
1              3/25/2019
2              12/26/2017
2              2/27/2019
2              3/16/2019
2              3/17/2019 

I want such as output:

pat_id         admin_date      consecutive
--------------------------------------------
1              3/10/2019            1
1              3/11/2019            2
1              3/23/2019            1
1              3/24/2019            2
1              3/25/2019            3
2              12/26/2017           1
2              2/27/2019            1
2              3/16/2019            1
2              3/17/2019            2

so that I can use these consecutive days value (per pat_id) to filter for my cohort. I’ve seen few posts that suggested using DateDiff/DateAdd with row_number, such as:

datediff(day, -row_number() over (partition by mrn order by admin_date), admin_date) 

but datediff/dateadd functions wouldn’t work on Netezza…

The closest I’ve got so far was:

select row_number() over (partition by mrn order by administration_date) as consecutive 

which doesn’t recognize gap between dates and return such an output:

pat_id         admin_date      consecutive
--------------------------------------------
1              3/10/2019            1
1              3/11/2019            2
1              3/23/2019            3
1              3/24/2019            4
1              3/25/2019            5
2              12/26/2017           1
2              2/27/2019            2
2              3/16/2019            3
2              3/17/2019            4

Does anyone know how to tackle this?

Advertisement

Answer

Use lag() to see where the groups start and a cumulative sum to define the group. The rest is just row_number():

select t.*,
       row_number() over (partition by pat_id, grp order by admin_date) as consecutive
from (select t.*,
             sum( case when prev_ad = admin_date - interval '1 day' then 0 else 1 end) over
                 (partition by pat_id order by admin_date) as grp
      from (select t.*, 
                   lag(admin_date) over (partition by pat_id order by admin_date) as prev_ad
            from t
           ) t
     )t ;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement