Postgresql Bitmap Heap Scan slow

My table looks like:

create table invoices
(
    id            serial not null,
    data          jsonb,
    modified      date,
    search_string text   not null
);

I need to search the table with ILIKE on search_string.
There may be many different search queries in the same request.

My query looks like:

SELECT *
FROM invoices
WHERE (
    search_string ILIKE '%1%'
    OR search_string ILIKE '%2%'
    OR search_string ILIKE '%3%'
)

Explain for searching without index

Seq Scan on invoices  (cost=0.00..147139.51 rows=1004406 width=1006) (actual time=0.038..2341.489 rows=1004228 loops=1)
   Filter: ((search_string ~~* '%1%'::text) OR (search_string ~~* '%2%'::text) OR (search_string ~~* '%3%'::text))
   Rows Removed by Filter: 1943
 Planning Time: 4.682 ms
 Execution Time: 2427.400 ms

I tried to make it faster by creating GIN index:

CREATE EXTENSION pg_trgm;
CREATE INDEX invoices_search_string_trigram_index ON invoices USING gin (search_string gin_trgm_ops);

Explain for searching with index

 Bitmap Heap Scan on invoices_invoice  (cost=414767.41..561902.40 rows=1004149 width=1006) (actual time=14878.331..17862.840 rows=1004228 loops=1)
  Recheck Cond: ((search_string ~~* '%1%'::text) OR (search_string ~~* '%2%'::text) OR (search_string ~~* '%3%'::text))
  Rows Removed by Index Recheck: 1943
  Heap Blocks: exact=63341 lossy=66186
  ->  BitmapOr  (cost=414767.41..414767.41 rows=1006171 width=0) (actual time=14842.199..14842.199 rows=0 loops=1)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..137979.36 rows=874048 width=0) (actual time=4520.466..4520.466 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%1%'::text)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..138208.03 rows=904538 width=0) (actual time=4357.453..4357.453 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%2%'::text)
        ->  Bitmap Index Scan on trgm_idx_search_string  (cost=0.00..137826.91 rows=853721 width=0) (actual time=5964.276..5964.276 rows=546232 loops=1)
              Index Cond: (search_string ~~* '%3%'::text)
Planning Time: 1.198 ms
Execution Time: 17971.102 ms

Why are my index searches slower than seq scan?
Is there any way to make this type of search faster?

Answer

Your problem are probably the 66186 lossy blocks. Increase work_mem until you have only exact blocks.

Considering that you have a million result rows, I’d say that this query will never be very fast unless you reduce the number to result rows.

Advertisement

Answer