How to delete some observations in some groups based on a condition

Question

I would like to delete some observations in some groups based on a condition. My condition applies within each group: if imput="toto" then delete the observation, but if a whole group has imput="toto" then ...

Accepted Answer

So you want to do two checks: is the value toto, and is the value only toto within the group? Here is SQL that will do that. I had it explicitly create the checks as new variables so you can see what is happening. To see the check values for all observations, remove the HAVING clause. If you are happy with it, you can remove the check variables and just move the conditions into the HAVING clause.

    data temp;
      row+1;  /* running row number, used to restore the original order */
      input siren $ class $ imput $;
    cards;
    A CP titi
    B CP toto
    C CE tata
    D CE tata
    F CM toto
    G CM toto
    H SU tata
    I SU toto
    ;

    proc sql ;
    create table want as
      select *
           , imput ne 'toto' as check1       /* 1 if this row is not toto */
           , max(imput ne 'toto') as check2  /* 1 if any row in the class is not toto */
      from temp
      group by class
      having check1 or not check2
      order by row
    ;
    quit;

To do this with just a DATA step you will want to use a "double DOW loop", so you can calculate the overall flag for a group in the first loop and then process the individual rows in the second loop. Note that this requires the input dataset to be sorted (or at least grouped) by class; the NOTSORTED option on the BY statement is what allows data that is grouped but not sorted.

    data want;
      /* first pass over the group: flag whether any row is not toto */
      do until(last.class);
        set temp;
        by class notsorted;
        if imput ne 'toto' then check2=1;
      end;
      /* second pass over the same group: keep non-toto rows, or every row if the group is all toto */
      do until(last.class);
        set temp;
        by class notsorted;
        if imput ne 'toto' or not check2 then output;
      end;
      drop check2;
    run;
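As mentioned for the SQL version, once you are happy with the checks you can drop the helper variables and put the conditions directly in the HAVING clause. A minimal sketch of what that might look like, assuming the same temp dataset and row variable created above:

```sas
proc sql;
create table want as
  select *
  from temp
  group by class
  /* keep rows that are not toto, or every row when the whole class is toto */
  having imput ne 'toto' or max(imput ne 'toto') = 0
  order by row
;
quit;
```

With the sample data this should keep rows 1, 3, 4, 5, 6 and 7: the toto rows in CP and SU are dropped, while CM is kept whole because every row in it is toto. Expect a note in the log about remerging summary statistics with the original data, since the query mixes detail rows with a group-level MAX.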

Answer