Consider the following sample table from a soccer tournament (let’s call this table matches
)
+----------+---------+--------------+ | match_id | club_id | goals_scored | +----------+---------+--------------+ | 1 | 1 | 1 | | 1 | 2 | 0 | | 2 | 1 | 1 | | 2 | 3 | 1 | | 3 | 1 | 0 | | 3 | 4 | 2 | | 4 | 2 | 2 | | 4 | 3 | 4 | | 5 | 2 | 4 | | 5 | 4 | 0 | | 6 | 3 | 1 | | 6 | 4 | 1 | +----------+---------+--------------+
The resulting table we want should give us each club’s total goals scored AND goals conceded:
+---------+--------------+----------------+ | club_id | goals_scored | goals_conceded | +---------+--------------+----------------+ | 1 | 2 | 4 | | 2 | 6 | 4 | | 3 | 6 | 4 | | 4 | 3 | 5 | +---------+--------------+----------------+
Getting goals scored is straight forward enough…
SELECT SUM(goals_scored), club_id FROM matches GROUP BY club_id
but I am absolutely flummoxed as to how to get it for each team’s opponents.
I could, of course, construct a pretty complex array of subqueries to get there. If this were application-side work I’d likely just stuff it in a loop and iterate over each club to get there, but my use case requires a SQL answer if possible. Any thoughts?
edit: also if anyone has any better ideas on how to title this question, I’m all ears – I’m not really sure exactly how to describe this problem in the first place.
Advertisement
Answer
We can use a self-join approach here:
SELECT m1.club_id, SUM(m1.goals_scored) AS goals_scored, SUM(m2.goals_scored) AS goals_conceded FROM matches m1 INNER JOIN matches m2 ON m2.match_id = m1.match_id AND m2.club_id <> m1.club_id GROUP BY m1.club_id ORDER BY m1.club_id;
This approach brings the goals conceded by each club to the other club, for each match, into a single row. We then just aggregate by club to get the two sums.