I’ve got solutions here in stack overflow for SQLZoo Join Lesson Challenge 13
- problem link: https://sqlzoo.net/wiki/The_JOIN_operation
- solution link: SQLzoo JOIN tutorial #13
But, before to use the solution that I found here, I was trying:
SELECT ga.mdate, ga.team1, COUNT(go1.teamid), ga.team2, COUNT(go2.teamid) FROM game ga LEFT JOIN goal go1 ON id=go1.matchid AND go1.teamid=ga.team1 LEFT JOIN goal go2 ON id=go2.matchid AND go2.teamid=ga.team2 GROUP BY ga.mdate, ga.team1, ga.team1, ga.team2 ORDER BY ga.mdate, go1.matchid, ga.team1, ga.team2
But the answer is always wrong. I have a spreadsheet where I put the results Correct X Actual.
- link to the results I’ve got: https://docs.google.com/spreadsheets/d/1uTHfgbvuDBbPrVIaXX-i-RGny5uK1j6wJYwmTNUZ7sQ/edit#gid=0
It seems that:
- in cases where both teams have done at least 1 goal, the query is multiplying the number of goals (instead of showing the number of goals of each team).
- in other cases (when at least 1 team has 0 goals), the query is returning the right values.
So far I understood of SQL, I think it should work. So I see that I’m missing some SQL concept here.
Could please someone help me to understand it better?
Thanks in advance!
Advertisement
Answer
Because you join to goal
twice, you effectively join every goal for one team to every goal for the other team. The two JOIN
s are effectively independent.
So, for instance, if there were three goals for each team, we end up with nine rows before we do any aggregation. This also means that each goal (for each team) appears 3 times in those rows. You then, of course, count all of those rows and get the “multiplying” effect you describe.