I need help optimizing a Postgres query which uses the BETWEEN clause with a timestamp field.
I have two tables:

ONE (int id_one (PK), datetime cut_time, int f1, ...) containing about 3,394 rows
TWO (int id_two (PK), int id_one (FK), int f2, ...) containing about 4,000,000 rows

There are btree indexes on both PKs (id_one and id_two), on the FK id_one, and on cut_time.
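For reference, here is a minimal sketch of how that schema might be declared in Postgres. The index names and the exact column types beyond those listed are assumptions (Postgres has no datetime type, so cut_time is shown as timestamp):

CREATE TABLE one (
    id_one   integer PRIMARY KEY,   -- btree index created implicitly for the PK
    cut_time timestamp,             -- described as "datetime"; timestamp in Postgres
    f1       integer
);

CREATE TABLE two (
    id_two   integer PRIMARY KEY,   -- btree index created implicitly for the PK
    id_one   integer REFERENCES one (id_one),
    f2       integer
);

CREATE INDEX idx_two_id_one   ON two (id_one);    -- index on the FK (assumed name)
CREATE INDEX idx_one_cut_time ON one (cut_time);  -- index on the timestamp column (assumed name)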
I want to perform a query like:
SELECT o.id_one, date(o.cut_time), o.f1, t.f2
FROM one o
INNER JOIN two t ON o.id_one = t.id_one
WHERE o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';
This query retrieves about 1,700,000 rows in about 7 seconds.
The EXPLAIN ANALYZE output is shown below:
Merge Join  (cost=20000000003.53..20000197562.38 rows=1680916 width=24) (actual time=0.017..741.718 rows=1692345 loops=1)
  Merge Cond: (c.coilid = hf.coilid)
  ->  Index Scan using pk_coils on coils c  (cost=10000000000.00..10000000382.13 rows=1420 width=16) (actual time=0.008..4.539 rows=1404 loops=1)
        Filter: ((cut_time >= '2013-01-01 00:00:00'::timestamp without time zone) AND (cut_time <= '2013-01-31 00:00:00'::timestamp without time zone))
        Rows Removed by Filter: 1990
  ->  Index Scan using idx_fk_lf_data on hf_data hf  (cost=10000000000.00..10000166145.90 rows=4017625 width=16) (actual time=0.003..392.535 rows=1963386 loops=1)
Total runtime: 768.473 ms
The index on the timestamp column isn't used. How can I optimize this query?
Answer
The query itself executes in well under one second: the plan above reports a total runtime of about 768 ms. The other 6+ seconds are spent transferring the roughly 1.7 million result rows from the server to the client.
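One way to confirm this split (a sketch, using the table and column names from the question) is to measure server-side execution with EXPLAIN ANALYZE, which runs the plan without sending result rows to the client, and to compare it with a query that aggregates on the server so only a single row crosses the network:

-- Server-side execution time only: the plan runs, but no result rows are sent to the client.
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id_one, date(o.cut_time), o.f1, t.f2
FROM one o
INNER JOIN two t ON o.id_one = t.id_one
WHERE o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

-- Aggregate on the server: same work, but only one row is returned over the network.
SELECT count(*)
FROM one o
INNER JOIN two t ON o.id_one = t.id_one
WHERE o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

If both of these finish in roughly the time the plan reports while the full SELECT takes 7 seconds, the difference is the cost of shipping 1.7 million rows to the client, not query execution.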