I have an issue where some SAS code using PROC SQL isn’t working as there is a case statement around one of my calculations.
The code looks like the below:
PROC SQL; create table example as select a.var1, a.var2, a.var3, case when a.NumericVariable is not missing then Max(0,Sum(Sum(a.Var4), -Sum(a.Var5)))) end as Calculation from table1 a inner join table2 b on a.primarykey=b.primarykey group by var1,var2,var3; quit;
This code returns duplicated rows for the number of rows in table A. Removing the case statement as below:
PROC SQL; create table example as select a.var1, a.var2, a.var3, Max(0,Sum(Sum(a.Var4), -Sum(a.Var5)))) as Calculation from table1 a inner join table2 b on a.primarykey=b.primarykey group by var1,var2,var3; quit;
This just returns the group by using the levels of var1,var2 and var3.
How do I adjust the upper code to remove the duplicates?
Advertisement
Answer
This can be solved by performing the case
statement inside the summary. This only works if it’s possible to do that, though – you might have to rework your query to make it possible.
If you can’t put the case
statement inside the summarization, and the case
statement is using a variable that isn’t in the group by (and can’t be put in the group by), then it’s not possible to do what you want here – you have to change something else, maybe preprocess the data.
One example is, if you want to only include a non-zero value if var6
is not missing, but you’re okay with zero in that case, then this works:
PROC SQL; create table example as select a.var1, a.var2, a.var3, max(0,sum(sum(case when a.var6 is not missing then a.var4 end),-(sum(case when a.var6 is not missing then a.var5 end)))) as Calculation from table1 a inner join table2 b on a.primarykey=b.primarykey group by var1,var2,var3; quit;
Here’s an example that works using sashelp.class
:
proc sql; select max(0,sum(sum(case when age > 13 then height else 0 end), sum(case when age > 13 then weight else 0 end)) ) from sashelp.class group by sex; quit;