This question is following this question where I wanted to select the MAX value of multiples fields while retrieving each row.
The accepted answer with UNION ALL
worked like a charm but I now have some scaling issues.
To give some context, I have more than 3 million rows in my matches table and the filters used in the WHERE
condition can reduce this dataset to about 5000-6000 rows. I’m using PostgreSQL.
The query takes something like 14-16 seconds to process. The strange thing is that if I run one query at a time, it will take 150ms.
So if my maths are corrects, the total duration of this query should be 150ms * 20 (number of fields to select max value) = 3 seconds, not 16 ??
Why the entire query takes so much time ?
Here are some questions I have about that:
- Is it just better to do 20 queries and aggregate the final result ?
- Can I speed up my query by using some index ?
- Is it possible to make the WHERE filters + JOIN only once instead of doing it in all my queries ?
PS: here is the Node.js code I use if you want to read the query in a more readable way than the 500 lines of the pastebin:
const fields = [ 'match_players.kills', 'match_players.deaths', 'match_players.assists', 'match_players.gold', 'matches.game_duration', 'match_players.minions', 'match_players.kda', 'match_players.damage_taken', 'match_players.damage_dealt_champions', 'match_players.damage_dealt_objectives', 'match_players.kp', 'match_players.vision_score', 'match_players.critical_strike', 'match_players.time_spent_living', 'match_players.heal', 'match_players.turret_kills', 'match_players.killing_spree', 'match_players.double_kills', 'match_players.triple_kills', 'match_players.quadra_kills', 'match_players.penta_kills', ] const query = fields .map((field) => { return ` (SELECT '${field}' AS what, ${field} AS amount, match_players.win as result, matches.id, matches.date, matches.gamemode, match_players.champion_id FROM match_players INNER JOIN matches ON matches.id = match_players.match_id WHERE match_players.summoner_puuid = :puuid AND match_players.remake = 0 AND matches.gamemode NOT IN (800, 810, 820, 830, 840, 850, 2000, 2010, 2020) ORDER BY ${field} DESC, matches.id LIMIT 1) ` }) .join('UNION ALL ') const { rows } = await Database.rawQuery(query, { puuid })
Thanks a lot for your time.
Advertisement
Answer
If your database engine and API support common table expressions (WITH
keyword), then you could first perform the query that makes the join and the filtering, and then use the result set for performing the UNION ALL
:
const query = ` WITH base as ( SELECT ${fields.join()}, match_players.win as result, matches.id, matches.date, matches.gamemode, match_players.champion_id FROM match_players INNER JOIN matches ON matches.id = match_players.match_id WHERE match_players.summoner_puuid = :puuid AND match_players.remake = 0 AND matches.gamemode NOT IN (800, 810, 820, 830, 840, 850, 2000, 2010, 2020) ) ` + fields.map((field) => ` (SELECT '${field}' AS what, ${field.split(".").pop()} AS amount, result, id, date, gamemode, champion_id FROM base ORDER BY 2 DESC, id LIMIT 1) `).join(' UNION ALL ');