Skip to content
Advertisement

How to increase performance of COUNT SQL query in PostgreSQL?

I have a table with multiply columns. But for simplicity purpose, we can consider the following table:

create table tmp_table
(
    entity_1_id varchar(255) not null,
    status integer default 1 not null,
    entity_2_id varchar(255)
);

create index tmp_table_entity_1_id_idx
    on tmp_table (entity_1_id);

create index tmp_table_entity_2_id_idx
    on tmp_table (entity_2_id);

I want to execute this request:

SELECT tmp_table.entity_2_id, COUNT(*) FROM tmp_table 
    WHERE tmp_table.entity_1_id='cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
          tmp_table.status <> 2 AND 
          tmp_table.entity_2_id = ANY (string_to_array('21c5598b-0620-4a8c-b6fd-a4bfee024254,af0f9cb9-da47-4f6b-a3c4-218b901842f7', ',')) 
    GROUP BY tmp_table.entity_2_id;

It works fine, when I send string to string_to_array function with a few values (like 1-20). But when I try to send 500 elements, it works too slow. Unfortunately, I really need 100-500 elements.

Advertisement

Answer

For this query:

SELECT t.entity_2_id, COUNT(*)
FROM tmp_table t
WHERE t.entity_1_id = 'cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
      t.status <> 2 AND 
      t.entity_2_id = ANY (string_to_array('21c5598b-0620-4a8c-b6fd-a4bfee024254,af0f9cb9-da47-4f6b-a3c4-218b901842f7', ',')) 
GROUP BY t.entity_2_id;

I would recommend an index on tmp_table(entity_1_id, entity_2_id, status).

However, you might find this faster:

select rst.entity_2_id,
       (select count(*)
        from tmp_table t
        where t.entity_2_id = rst.entity_2_id and
              t.entity_1_id = 'cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
              t.status <> 2
       ) as cnt
from regexp_split_to_table(str, ',') rst(entity_2_id);

Then you want an index on tmp_table(entity_2_id, entity_1_id, status).

In most databases, this would be faster, because the index is a covering index and this avoids the final aggregation over the entire result set. However, Postgres stores locking information on the data pages, so they still need to be read. It is still worth trying.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement