How to iterate rows with known start position and mixed column ordering, exactly as if walking the corresponding index?

Question

I have a given, unmodifiable table design which resembles: I simply want to iterate rows from this table in the order c1 asc, c2 desc, starting at a known position, e.g., ('foo', 5). The query must be efficient, i.e. the query planner must be able to take advantage of the unique index and do an index scan if the table

Accepted Answer

I suspect that the trickiest part will stopping it from using the index. Which is to say just walking the index from one end discarding stuff that doesn’t match the filter.SELECT * FROM t1 WHERE (c1 = 'CXB' AND c2 <= 57) OR c1 > 'CXB'ORDER BY c1 ASC, c2 DESC LIMIT 999;The planner isn’t smart enough (afaict in 2021) to spot that the OR there has a strong correlation on each side. We know that we are actually just trying to get a subset of the table, but the planner – nope.So – one thing you can do with a tricky OR is to split it into two subqueries and use UNION ALL.WITH part1 AS ( SELECT * FROM t1 WHERE (c1 = 'CXB' AND c2 <= 57) ORDER BY c1 ASC, c2 DESC LIMIT 999), part2 AS ( SELECT * FROM t1 WHERE c1 > 'CXB' ORDER BY c1 ASC, c2 DESC LIMIT 999)SELECT * FROM part1UNION ALLSELECT * FROM part2ORDER BY c1 ASC, c2 DESC LIMIT 999;I’ve written this as a CTE (the WITH clauses) because I think it looks more readable that way. It should be identical to actual subqueries for the planner though.That gives me (EXPLAIN ANALYSE ...) a plan something like: Limit (cost=110.23..177.47 rows=999 width=16) (actual time=0.232..2.172 rows=999 loops=1) -> Merge Append (cost=110.23..181.31 rows=1056 width=16) (actual time=0.230..1.872 rows=999 loops=1) Sort Key: t1.c1, t1.c2 DESC -> Sort (cost=109.80..109.94 rows=57 width=16) (actual time=0.141..0.155 rows=57 loops=1) Sort Key: t1.c1, t1.c2 DESC Sort Method: quicksort Memory: 29kB -> Limit (cost=0.42..107.56 rows=57 width=16) (actual time=0.055..0.104 rows=57 loops=1) -> Index Scan using t1u1 on t1 (cost=0.42..107.56 rows=57 width=16) (actual time=0.055..0.085 rows=57 loops=1) Index Cond: (((c1)::text = 'CXB'::text) AND (c2 <= 57)) -> Limit (cost=0.42..50.81 rows=999 width=16) (actual time=0.085..1.403 rows=942 loops=1) -> Index Scan using t1u1 on t1 t1_1 (cost=0.42..40347.24 rows=799950 width=16) (actual time=0.085..1.143 rows=942 loops=1) Index Cond: ((c1)::text > 'CXB'::text) Planning Time: 0.677 ms Execution Time: 2.387 msYou incur an extra sort I’m afraid, but you can’t guarantee your ordering otherwise.Obviously this doesn’t give your your odd queries that should be returning true or an empty set. That’s because they don’t really make any sense in the sense of asking for a subset of the table.Test data used for above query plan:CREATE FUNCTION threechars(i int) RETURNS TEXT AS $$ SELECT chr((i / (26*26)) % 26 + 65) || chr((i / 26) % 26 + 65) || chr(i % 26 + 65) $$ LANGUAGE sql IMMUTABLE ;CREATE TABLE t1 (c1 varchar(64) not null, c2 bigint not null, i int not null);CREATE UNIQUE INDEX t1u1 ON t1 (c1 ASC, c2 DESC);INSERT INTO t1 SELECT threechars(i / 99), 99 - (i % 99), i FROM generate_series(0, 999999) i;ANALYSE t1;

Advertisement

Answer