Skip to content
Advertisement

Sum columns depending on another column value

I’m having trouble with summing the fields values based on another fields value.

I need to SUM(activities.points) based on activities.activity_type if it’s used_points or added_points and put it in AS used_points/added_points.

Table activities:

id | subscription_id  | activity_type | points
--------------------------------------------------
1  |         1        |  used_points  |   10
2  |         1        |  used_points  |   50 
3  |         1        | added_points  |   20 
4  |         1        | added_points  |   30
5  |         2        |  used_points  |   20
6  |         2        |  used_points  |   45
7  |         2        | added_points  |   45
8  |         2        | added_points  |   45

Table subscriptions:

id |     name    |  current_points
-------------------------------------
 1 |    card_1   |        700
 2 |    card_2   |        900

What I need:

  name   |  current_points  | used_points | added_points
-----------------------------------------------------------
 card_1  |         700      |     60      |     50
 card_2  |         900      |     65      |     90

What I tried :

SELECT
    subscriptions.name,
    subscriptions.current_points,
    IF(activities.activity_type="used_points", SUM(activities.points), null)
        AS used_points,
    IF(activities.activity_type="added_points", SUM(activities.points), null)
        AS added_points 
FROM activities
JOIN subscriptions
    ON activities.subscription.id = subscription.id
GROUP BY subscriptions.name

Which is wrong.

Thanks

Advertisement

Answer

You want to use SUM(IF( )). You want to add up the values returned from the IF. You want that IF expression to be evaluated for each individual row. Then, use the SUM aggregate to add up the value returned for each row.

Remove the SUM aggregate from inside the IF expression and instead, wrap the IF inside a SUM.


Followup

Q But why SUM() inside of IF doesn’t work ?

A Well, it does work. It’s just not working the way you want it to work.

The MySQL SUM function is an “aggregate” function. It aggregates rows together, and returns a single value.

For an expression of this form: IF(col='foo',SUM(numcol),0)

What MySQL is doing is aggregating all the rows into the SUM, and returning a single value.

Other databases would pitch a fit, and throw an error with the reference to the non-aggregate col in that expression. MySQL is more lenient, and treats the col reference like it was an aggregate (like MIN(col), or MAX(col)… working on a group of rows, and returning a single value. In this case, MySQL is selecting a single, sample row. (It’s not determinate which row will be “chosen” as the sample row.) So that reference to col is sort of like a GET_VALUE_FROM_SAMPLE_ROW(col). Once the aggregates are completed, then that IF expression gets evaluated once.

If you start with this query, this is the set of rows you want to operate on.

SELECT s.name
     , s.current_points
     , a.activity_type
     , a.points
     , IF(a.activity_type='used_points',a.points,NULL) AS used_points
     , IF(a.activity_type='added_points',a.points,NULL) AS added_points
  FROM subscriptions s
  JOIN activities a
    ON a.subscription_id = s.id

When you add a GROUP BY clause, that’s going to aggregate some of those rows together. What you will get back for the non-aggregates is values from a sample row.

Try adding GROUP BY s.name to the query, and see what is returned.

Also try adding in some aggregates, such as SUM(a.points)

SELECT s.name
     , s.current_points
     , a.activity_type
     , a.points
     , IF(a.activity_type='used_points',a.points,NULL) AS used_points
     , IF(a.activity_type='added_points',a.points,NULL) AS added_points
     , SUM(a.points) AS total_points
  FROM subscriptions s
  JOIN activities a
    ON a.subscription_id = s.id
 GROUP BY s.name

Finally, we can add in the expressions in your query into the SELECT list:

     , IF(a.activty_type='used_points',SUM(a.points),NULL) AS if_used_sum
     , IF(a.activty_type='added_points',SUM(a.points),NULL) AS if_added_sum

What we discover is that the value returned from these expressions will either be SUM(a.points), which will match the total_points, or it will be NULL. And we can see the value of the activity_type column, retrieved from a single, sample row for each group, and we can see that this is expression is “working”, it’s just not doing what we you really want to happen: for the conditional test to run on each individual row, returning a value for points or a null, and then summing that up for the group.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement