I’m trying to query a table and calculate the running sum of a column’s values for pairwise combinations of two other columns.
Specifically, given the following table:
CREATE TABLE test ( bucket int(10) NOT NULL, label varchar(10) NOT NULL, amount int(10) NOT NULL ); INSERT INTO test VALUES (1, "A", 1), (1, "B", 2), (1, "C", 3), (2, "A", 4), (2, "B", 5), (2, "C", 6), (3, "A", 7), (3, "B", 8), (3, "C", 9), (4, "A", 10), (4, "B", 11), (4, "C", 12);
I want to query a column that calculates the cumulative sum for each bucket/label pair.
Put another way, I want to be able to write a query that selects bucket, label, and a third column that’s the cumulative sum of the “amount” column for every entry with a label that matches that row’s label. For example, for the above example, the third column should have the values:
1, 2, 3, 5, 7, 9, 12, 15, 18, 22, 26, 30
In other words, the value 5 is the sum of 1 and 4 (the first two amounts for the “A” label), the value 7 is the sum of 2 and 5 (the first two amounts of the “B” label), etc.
I know I can get the cumulative sum for the entire column by doing something like:
SELECT *, SUM(amount) OVER ( ORDER BY amount ) AS running_total FROM test
but I’m not sure how to split it up by label.
Any help is much appreciated!
Advertisement
Answer
You want a partition by
clause:
SELECT t.*, SUM(amount) OVER (PARTITION BY label ORDER BY amount ) AS running_total FROM test t ORDER BY bucket, amount;
Although ORDER BY amount
works for your data, I think you might really want the bucket as well:
SELECT t.*, SUM(amount) OVER (PARTITION BY label ORDER BY bucket, amount ) AS running_total FROM test t ORDER BY bucket, amount;
Here is a db<>fiddle. Note that this uses MySQL 8, because your create/insert code is compatible with MySQL.