Consider the following sample table from a soccer tournament (let’s call this table matches
)
+----------+---------+--------------+
| match_id | club_id | goals_scored |
+----------+---------+--------------+
| 1 | 1 | 1 |
| 1 | 2 | 0 |
| 2 | 1 | 1 |
| 2 | 3 | 1 |
| 3 | 1 | 0 |
| 3 | 4 | 2 |
| 4 | 2 | 2 |
| 4 | 3 | 4 |
| 5 | 2 | 4 |
| 5 | 4 | 0 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
+----------+---------+--------------+
The resulting table we want should give us each club’s total goals scored AND goals conceded:
+---------+--------------+----------------+
| club_id | goals_scored | goals_conceded |
+---------+--------------+----------------+
| 1 | 2 | 4 |
| 2 | 6 | 4 |
| 3 | 6 | 4 |
| 4 | 3 | 5 |
+---------+--------------+----------------+
Getting goals scored is straight forward enough…
SELECT SUM(goals_scored),
club_id
FROM matches
GROUP BY club_id
but I am absolutely flummoxed as to how to get it for each team’s opponents.
I could, of course, construct a pretty complex array of subqueries to get there. If this were application-side work I’d likely just stuff it in a loop and iterate over each club to get there, but my use case requires a SQL answer if possible. Any thoughts?
edit: also if anyone has any better ideas on how to title this question, I’m all ears – I’m not really sure exactly how to describe this problem in the first place.
Advertisement
Answer
We can use a self-join approach here:
SELECT
m1.club_id,
SUM(m1.goals_scored) AS goals_scored,
SUM(m2.goals_scored) AS goals_conceded
FROM matches m1
INNER JOIN matches m2
ON m2.match_id = m1.match_id AND
m2.club_id <> m1.club_id
GROUP BY
m1.club_id
ORDER BY
m1.club_id;
This approach brings the goals conceded by each club to the other club, for each match, into a single row. We then just aggregate by club to get the two sums.