I have a users table with columns (id, xp, …) and around 1.5 million rows.
I am getting someone’s position in the XP leaderboard with the following query (which took 33 seconds to execute):
EXPLAIN ANALYZE WITH counts AS ( SELECT DISTINCT id, ROW_NUMBER () OVER (ORDER BY xp DESC) FROM users ) SELECT * FROM counts WHERE id=1;
Subquery Scan on counts (cost=344492.80..395160.57 rows=7404 width=16) (actual time=30683.244..32174.117 rows=1 loops=1) Filter: (counts.id = '1'::bigint) Rows Removed by Filter: 1481060 -> HashAggregate (cost=344492.80..376651.79 rows=1480702 width=24) (actual time=30679.440..32034.921 rows=1481061 loops=1) Group Key: users.id, row_number() OVER (?)" Planned Partitions: 64 Batches: 65 Memory Usage: 4369kB Disk Usage: 125960kB -> WindowAgg (cost=212155.06..238067.34 rows=1480702 width=24) (actual time=2983.137..20302.548 rows=1481061 loops=1) -> Sort (cost=212155.06..215856.81 rows=1480702 width=16) (actual time=2983.082..5040.782 rows=1481061 loops=1) Sort Key: users.xp DESC Sort Method: external merge Disk: 37760kB -> Seq Scan on users (cost=0.00..35094.02 rows=1480702 width=16) (actual time=25.467..880.626 rows=1481061 loops=1) Planning Time: 2.593 ms JIT: Functions: 14 Options: Inlining false, Optimization false, Expressions true, Deforming true" Timing: Generation 12.061 ms, Inlining 0.000 ms, Optimization 1.503 ms, Emission 26.086 ms, Total 39.650 ms" Execution Time: 32325.206 ms
My table definition:
CREATE TABLE users ( id bigint NOT NULL CONSTRAINT users_pkey PRIMARY KEY, xp bigint DEFAULT 0 NOT NULL, ... ); CREATE INDEX user_xp_leaderboard_index ON users (xp DESC, id ASC);
But it is extremely slow. Although it is not surprising considering it sorts the whole table and filter it, I don’t know how I could improve/optimize this query.
I did SET work_mem TO '1 GB';
. It helped a bit but not much.
Any help would be appreciated. Thanks in advance.
Advertisement
Answer
You can write the query like this:
select count(*) from users u where u.xp >= (select u2.xp from users u2 where u2.id = 1);
This can take advantage of an index on users(id, xp)
. This should totally eliminate any sorting. An index on users(xp)
could also be helpful if the rows are quite wide and Postgres can use an index-only scan.