Skip to content
Advertisement

Left Joining table B on table A is instantaneous when B has matches to A, but takes forever (> 1 minute) when there are no matches. Why is this?

I want to:

SELECT T.TournamentId, Sum(Score) as Score
FROM Tournament T
    LEFT JOIN Scores S on S.TournamentId = T.TournamentId
WHERE T.TournamentId = x
GROUP BY T.TournamentId

When I choose a TounamentId “x” in the WHERE clause that hasn’t started yet, the query takes forever to run. When I choose an “x” for a tournament that has started, it runs instantly.

The real query is a bit more complicated than this. But this is the root of the issue. Why would this be and what can I do to speed it up? I’d like to be able to use the same query for both cases. But if there’s nothing I can do, I’ll create a second query to run when the Tournament hasn’t started.

Advertisement

Answer

If any of the tables are very big this problem makes sense, and in that case you should try to limit it as much as possible (Some people will claim the server will do it itself, but that is not the case all the time).

Try for instance

SELECT
    T.TournamentId, SUM(Score) AS Score
FROM 
    Tournament T
LEFT JOIN 
    Scores S ON S.TournamentId = T.TournamentId AND S.TournamentId = x
WHERE 
    T.TournamentId = x
GROUP BY 
    T.TournamentId

Otherwise you can write a stored procedure and create a temporary table that is a copy of S but only contains rows where TournamentId = x. Joins are always heavy on databases and making the tables smaller before using them in a join can speed things up millions of percent.

As pointed out you can also use index to speed things up, in that case the index need to be on the values you join on, you should also consider rebuilding the index on a regular basis (easy in mssql, PAIN in mysql). And to speed things up even more you can add custom indexes to your temporary tables, especially if you have many huge tables this will make things alot faster, if it’s just a few hundred/thousand line impact will be negligible or even negative..

# Make sure the tables are not already there
DROP TEMPORARY TABLE IF EXISTS tmp_Tournament;
DROP TEMPORARY TABLE IF EXISTS tmp_Scores;

CREATE TEMPORARY TABLE tmp_Tournament
SELECT * from Tournament WHERE TournamentId = x;
CREATE INDEX tmp_Tournament_TournamentId ON tmp_Tournament (TournamentId);
CREATE TEMPORARY TABLE tmp_Scores
SELECT * FROM Scores WHERE TournamentId = x;
CREATE INDEX tmp_Scores_TournamentId ON tmp_Scores (TournamentId);

SELECT
    T.TournamentId, SUM(Score) AS Score
FROM 
    tmp_Tournament T
LEFT JOIN 
    tmp_Scores S ON S.TournamentId = T.TournamentId AND S.TournamentId = x
WHERE 
    T.TournamentId = x
GROUP BY 
    T.TournamentId;

# Just some cleanup
DROP TEMPORARY TABLE IF EXISTS tmp_Tournament;
DROP TEMPORARY TABLE IF EXISTS tmp_Scores;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement