I need to compare the last row of a group to the row above it, see if changes occur in a few columns, and populate a new column with 1 if a change occurs. The data presentation below will explain better.
Also need to account for having a group with only 1 row.
what we have:
Group Name Sport DogName Eligibility 1 Tom BBALL Toto Yes 1 Tom BBall Toto Yes 1 Tom golf spot Yes 2 Nancy vllyball Jimmy yes 2 Nancy vllyball rover no
what we want:
Group Name Sport DogName Eligibility N_change S_change D_Change E_change 1 Tom BBALL Toto Yes 0 0 0 0 1 Tom BBall Toto Yes 0 0 0 0 1 Tom golf spot Yes 0 1 1 0 2 Nancy vllyball Jimmy yes 0 0 0 0 2 Nancy vllyball rover no 0 0 1 1
Only care about changes from row to row within group. Thank you for any help in advance.
The rows are already ordered so we only need to last two of the group. If it is easier to compare sequential rows in a group then that is just as good for my purposes.
I did know this would be arrays and I struggle with these because never use them for my typical sas modeling. Wanted to keep things short and sweet.
Advertisement
Answer
Use the data step and lag statements. Ensure your data is sorted by group first, and that the rows within groups are sorted in the correct order. Using arrays will make your code much smaller.
The logic below will compare each row with the previous row. A flag of 1 will be set only if:
- It’s not the first row of the group
- The current value differs from the previous value.
The syntax var = (test logic);
is a shortcut to automatically generate dummy flags.
data want; set have; by group; array var[*] name sport dogname eligibility; array lagvar[*] $ lag_name lag_sport lag_dogname lag_eligibility; array changeflag[*] N_change S_change D_change E_change; do i = 1 to dim(var); lagvar[i] = lag(var[i]); changeflag[i] = (var[i] NE lagvar[i] AND NOT first.group); end; drop lag: i; run;