Skip to content
Advertisement

How to create flag 0/1 inform whether was changed or not during 4 months between two columns in PROC SQL in SAS Enterprise Guide?

I have table in SAS Enterprise Guide like below:

name of table: table123

date typ:

  • ID – numeric
  • DT – numeric
  • VAL1 – string
  • VAL2 – string
ID DT VAL1 VAL2
123 20210523 PR P
123 20211222 P P
222 20210502 PR M
222 20210711 M PR
38 20210513 PR PR
991 20211123 M P

And I need to create column “COL1” with values 0/1:

  • If ID changed value from column “VAL1” from PR to value P or M in column “VAL2” and has not changed back to PR for 4 months or more -> 1 else 0

SO I need something like below:

ID DT VAL1 VAL2 COL1
123 20210523 PR P 1
123 20211222 P P 1
222 20210502 PR M 0
222 20210711 M PR 0
38 20210513 PR PR 0
991 20211123 M P 0

Because:

  • 123 – has 1 in “COL1” because changed PR to P and does not changed back to PR for 4 month or more
  • 222 – has 0 in “COL1” because changed PR to M but did not last like this for at least 4 months (after 2 months backed to PR)
  • 38 – has 0 in “COL1” because did not change PR to P or M
  • 991 – has 0 in “COL1” because did not change PR to P or M

How can I do that in SAS Enterprise Guide in PROC SQL ?

Advertisement

Answer

You’ll want to use a data step for this one since it uses fairly complex logic. You must first convert your date into a SAS date for this to work.

data col1_flag;
    set have;
    by id dt;

    /* Do not reset these variables to missing at each row */
    retain flag_pr_to_p_pm dt_from_pr_to_p_m;
    
    /* Reset flags and dates at each new ID */
    if(first.id) then call missing(flag_pr_to_p_pm, dt_from_pr_to_p_m);

    /* If PR (VAL1) changes to P or M (VAL2), then set a flag. If VAL1
       changes back from P or M, reset the flag to 0*/
    if(VAL1 = 'PR' and VAL2 IN('P', 'M') ) then do;
        flag_pr_to_p_pm = 1;

        /* Store the number of months when it originally went from PR --> P or M */
        dt_from_pr_to_p_m = dt;     
    end;
        else if(VAL1 NOT IN ('P', 'M') ) then flag_pr_to_p_pm = 0;
    
    /* 1. We've encountered a situation where PR --> P or M
       2. It's been 4 months since PR --> P or M
       3. P or M has not changed back to PR
       4. OR: There's one row that goes from PR --> P or M
    */
    COL1 = (    flag_pr_to_p_pm
            AND intck('month', dt_from_pr_to_p_m, dt) > 4 
            AND NOT (VAL1 IN('P', 'M') AND VAL2 = 'PR')
            OR (flag_pr_to_p_pm AND first.id AND last.id)
           )
    ;

    if(COL1) then output;

    format dt_from_pr_to_p_m date9.;

    keep id;
run;

Output:

ID  DT          VAL1    VAL2    COL1
38  13MAY2021   PR      PR      0
123 23MAY2021   PR      P       1
123 22DEC2021   P       P       1
222 02MAY2021   PR      M       0
222 11JUL2021   M       PR      0
991 23NOV2021   M       P       0

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement