Storing ‘Rank’ for Contests in Postgres

Question

I’m trying to determine if there is a “low cost” optimization for the following query. We’ve implemented a system whereby ‘tickets’ earn ‘points’ and thus can be ranked. In order to support analytical queries, we store the rank of every ticket (tickets can …

Accepted Answer

The correlated subquery has to be executed for every row (20k times in your example). This only makes sense for a small number of rows or where the computation requires it.

This derived table is computed once before we join to it:

    UPDATE api_ticket t
    SET    rank = tt.rnk
    FROM  (
       SELECT tt.id
            , rank() OVER (PARTITION BY tt.event_id ORDER BY tt.points_earned DESC) AS rnk
       FROM   api_ticket tt
       WHERE  tt.status <> 'x'
       AND    tt.event_id =
       ) tt
    WHERE  t.id = tt.id
    AND    t.rank <> tt.rnk;  -- avoid empty updates

Should be quite a bit faster. 🙂

Additional improvements

The last predicate rules out empty updates (see: How do I (or can I) SELECT DISTINCT on multiple columns?). It only makes sense if the new rank can equal the old rank at least occasionally. Otherwise, remove it.

We don’t need to repeat AND t.status != 'x' in the outer query: since we join on the PK column id, it’s the same row, and therefore the same value, on both sides.

The standard SQL inequality operator is <>, even if Postgres supports != too.

Push down the predicate event_id = into the subquery as well. There is no need to compute ranks for any other event_id. In your original, this was handed down from the outer query. In the rewritten query, it is best applied in the subquery only. Since we use PARTITION BY tt.event_id, that’s not going to mess with ranks.
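
To try the join-to-a-derived-table UPDATE end to end, here is a minimal sketch against a made-up table. Only the table and column names (api_ticket, id, event_id, status, points_earned, rank) come from the answer; the column types, the sample rows, and the event id 1 are assumptions for illustration. The sketch also swaps <> for IS DISTINCT FROM in the last predicate: with <>, a row whose rank is still NULL would never be updated, because NULL <> 1 evaluates to NULL rather than true.

```sql
-- Hypothetical minimal schema; only the names are taken from the answer.
CREATE TEMP TABLE api_ticket (
    id            int  PRIMARY KEY,
    event_id      int  NOT NULL,
    status        text NOT NULL,
    points_earned int  NOT NULL,
    rank          int              -- NULL until the ticket is first ranked
);

INSERT INTO api_ticket (id, event_id, status, points_earned) VALUES
    (1, 1, 'a', 50),
    (2, 1, 'a', 70),
    (3, 1, 'a', 50),
    (4, 1, 'x', 90),   -- excluded by the status filter
    (5, 2, 'a', 10);   -- different event, not touched

-- The derived table r is computed once, then joined to the target on the PK.
UPDATE api_ticket t
SET    rank = r.rnk
FROM  (
    SELECT id
         , rank() OVER (PARTITION BY event_id ORDER BY points_earned DESC) AS rnk
    FROM   api_ticket
    WHERE  status <> 'x'
    AND    event_id = 1             -- hypothetical event id
    ) r
WHERE  t.id = r.id
AND    t.rank IS DISTINCT FROM r.rnk;  -- skips empty updates, still true for NULL rank

SELECT id, rank FROM api_ticket ORDER BY id;
-- id | rank
--  1 |    2
--  2 |    1
--  3 |    2
--  4 | NULL
--  5 | NULL
```

Running the UPDATE a second time should touch zero rows, since every computed rank already matches the stored one. If rank is declared NOT NULL in the real table, the plain <> from the answer is equivalent.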
