I have two same queries but with different where condition values
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 1565 and account_id = 225 and deleted_at is NULL group by survey_contact_id, relation_id;
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 888 and account_id = 12 and deleted_at is NULL group by survey_contact_id, relation_id;
When I ran this two queries they both producing different result
for first query the result
GroupAggregate (cost=0.28..8.32 rows=1 width=24) (actual time=0.016..0.021 rows=4 loops=1) Group Key: survey_contact_id, relation_id -> Index Only Scan using test on nomination (cost=0.28..8.30 rows=1 width=8) (actual time=0.010..0.012 rows=5 loops=1) Index Cond: ((account_id = 225) AND (survey_id = 1565)) Heap Fetches: 5 Planning time: 0.148 ms Execution time: 0.058 ms
and for the 2nd one
GroupAggregate (cost=11.08..11.12 rows=2 width=24) (actual time=0.015..0.015 rows=0 loops=1) Group Key: survey_contact_id, relation_id -> Sort (cost=11.08..11.08 rows=2 width=8) (actual time=0.013..0.013 rows=0 loops=1) Sort Key: survey_contact_id, relation_id Sort Method: quicksort Memory: 25kB -> Bitmap Heap Scan on nomination (cost=4.30..11.07 rows=2 width=8) (actual time=0.008..0.008 rows=0 loops=1) Recheck Cond: ((account_id = 12) AND (survey_id = 888) AND (deleted_at IS NULL)) -> Bitmap Index Scan on test (cost=0.00..4.30 rows=2 width=0) (actual time=0.006..0.006 rows=0 loops=1) Index Cond: ((account_id = 12) AND (survey_id = 888)) Planning time: 0.149 ms Execution time: 0.052 ms
Can anyone explain Me why Postgres is making a BitMap Scan instead of Index Only scan ?
Advertisement
Answer
The short version is that Postgres has a cost-based approach, so it has estimated that the cost of doing so is less in the second case, based on the statistics it has.
In your case, the total cost (estimated) of each of these queries is 8.32
and 11.12
respectively. You may be able to see the cost of the index-only scan for the second query by running set enable_bitmapscan = off
.
Note that, based on its statistics, Postgres estimated that the first query would return 1 row (actually 4), and that the second would return 2 rows (actually 0).
There are several ways to get better statistics, but if analyze
(or autovacuum) hasn’t been run on that table for a while, that is a common cause of bad estimates. Another tell-tale that vacuum may not have been run recently (at least on this table) is the Heap Fetches: 5
you can see in the first query plan.
I’m confused by the “when data set increases” part of your question, please do add more context on that front if relevant.
Finally, if you’re not already planning a PostgreSQL upgrade, I highly recommend doing so soon. 9.6 is nearly out of support, and versions 10, 11, 12, and 13 each contained a host of performance focussed features.