I have a table with 4 columns. that looks something like this.
+-------------------------------------------------------------+ | ID | date | a1 | a2 | +-------------------------------------------------------------+ |1 | 31AUG2015 | 100 | 70 +-------------------------------------------------------------+ |1 | 01SEPT2015| 150 | 80 +-------------------------------------------------------------+ |2 | 31AUG2015 | 900 | 0 +-------------------------------------------------------------+ |2 | 01SEPT2015| 150 | 100 +-------------------------------------------------------------+
I want to add a1 and a2 for all rows before and up to that date, so I have something more like this:
+-------------------------------------------------------------+ | ID | date | a1 | a2 | +-------------------------------------------------------------+ |1 | 31AUG2015 | 100 | 70 +-------------------------------------------------------------+ |1 | 01SEPT2015| 250 | 150 +-------------------------------------------------------------+ |2 | 31AUG2015 | 900 | 0 +-------------------------------------------------------------+ |2 | 01SEPT2015| 1050 | 100 +-------------------------------------------------------------+
This was my attempt. A self join up to that date:
proc sql; create table want as select a.id ,a.date ,sum(a.a1) ,sum(a.a2) from have a, have b where a.id = b.id and a.dt <= b.dt group by a.id ,a.date quit;
The results are a mess and nothing lines up with my expectations. I am sure I am making a big mistake somewhere, but I would appreciate some guidance with either fixes to the proc sql or a data step.
Advertisement
Answer
A data step will make it easy. Sort it by your id and date, then use a sum statement to accumulate your values.
proc sort data=have; by id date; run; data want; set have; by id date; /* Reset cumulative sum at the start of each ID */ if(first.id) then call missing(a1_cume, a2_cume); a1_cume+a1; a2_cume+a2; run;