The most efficient method of performing a large number of similar SELECT queries on PostgreSQL?

I have a set of large tables, each with many records. I’m writing a Python program that SELECTs a large number of records from these tables, based on the values of several columns in those records.

Essentially, these are going to be lots of queries of the form:

SELECT <some columns> FROM <some table> WHERE <column1=val1 AND column2=val2...>

Each table has a different set of columns, but otherwise the SELECT formula above holds.

By default, I was going to run each of these through the psycopg2 PostgreSQL database driver as a separate query. But I’m wondering whether there’s a more efficient approach, given that there will be a very large number of such queries: thousands or more.
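For concreteness, here is a minimal sketch of the naive per-query approach described above. The table name, column names, and filter values are hypothetical; the helper only builds the SQL string and parameter list (using psycopg2’s `%s` placeholder style), and assumes table and column names come from trusted code, never from user input.

```python
def build_select(table, columns, filters):
    """Build one parameterized SELECT of the form
    SELECT <columns> FROM <table> WHERE col1 = %s AND col2 = %s ...

    `filters` is a dict mapping column name -> value. Only the values
    are parameterized; identifiers are assumed trusted.
    """
    where = " AND ".join(f"{col} = %s" for col in filters)
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {where}"
    return sql, list(filters.values())

# Hypothetical table/columns for illustration:
sql, params = build_select("events", ["id", "payload"],
                           {"user_id": 42, "kind": "click"})
# With psycopg2 this would then be run as: cursor.execute(sql, params)
```

Running one `cursor.execute()` per record set like this pays one client-server round trip per query, which is the cost the answer below targets.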


Answer

If the SELECT list entries are the same for all queries (the same number of entries and the same data types), you can use UNION ALL to combine several such queries. This won’t reduce the amount of work for the database, but it will reduce the number of client-server round trips. This can be a huge improvement, because for short queries network latency is often the dominant cost.
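A minimal sketch of that batching idea, assuming every query in a batch shares the same SELECT list and placeholder structure: wrap each query in parentheses, join them with UNION ALL, and concatenate the parameter lists, so the whole batch goes to the server in one round trip. The function names here are illustrative, not from any library.

```python
def batch_union_all(sql, param_sets):
    """Combine N copies of one parameterized SELECT with UNION ALL.

    `sql` is a single parameterized query (psycopg2 %s style);
    `param_sets` is a list of parameter tuples, one per copy.
    Returns the combined SQL and a flat parameter list, suitable
    for a single cursor.execute(combined, params) call.
    """
    combined = " UNION ALL ".join(f"({sql})" for _ in param_sets)
    params = [p for ps in param_sets for p in ps]
    return combined, params

combined, params = batch_union_all(
    "SELECT id, payload FROM events WHERE user_id = %s",
    [(1,), (2,), (3,)],
)
```

One wrinkle to be aware of: UNION ALL concatenates the result rows, so if you need to know which sub-query produced which row, include a distinguishing column (for example, the filter value itself) in the SELECT list.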

If all your queries have different SELECT lists, there is nothing you can do.

User contributions licensed under: CC BY-SA