I have a table with 4 columns. that looks something like this.
x
+-------------------------------------------------------------+
| ID | date | a1 | a2 |
+-------------------------------------------------------------+
|1 | 31AUG2015 | 100 | 70
+-------------------------------------------------------------+
|1 | 01SEPT2015| 150 | 80
+-------------------------------------------------------------+
|2 | 31AUG2015 | 900 | 0
+-------------------------------------------------------------+
|2 | 01SEPT2015| 150 | 100
+-------------------------------------------------------------+
I want to add a1 and a2 for all rows before and up to that date, so I have something more like this:
+-------------------------------------------------------------+
| ID | date | a1 | a2 |
+-------------------------------------------------------------+
|1 | 31AUG2015 | 100 | 70
+-------------------------------------------------------------+
|1 | 01SEPT2015| 250 | 150
+-------------------------------------------------------------+
|2 | 31AUG2015 | 900 | 0
+-------------------------------------------------------------+
|2 | 01SEPT2015| 1050 | 100
+-------------------------------------------------------------+
This was my attempt. A self join up to that date:
proc sql;
create table want as
select
a.id
,a.date
,sum(a.a1)
,sum(a.a2)
from
have a,
have b
where
a.id = b.id and
a.dt <= b.dt
group by
a.id
,a.date
quit;
The results are a mess and nothing lines up with my expectations. I am sure I am making a big mistake somewhere, but I would appreciate some guidance with either fixes to the proc sql or a data step.
Advertisement
Answer
A data step will make it easy. Sort it by your id and date, then use a sum statement to accumulate your values.
proc sort data=have;
by id date;
run;
data want;
set have;
by id date;
/* Reset cumulative sum at the start of each ID */
if(first.id) then call missing(a1_cume, a2_cume);
a1_cume+a1;
a2_cume+a2;
run;