I need to retrieve data from around 28 tables in my java application which I am then going to dump to Elastic Search. I need to use JDBC to query and fetch the data from the database (PostgreSQL in my case.) Are joins the most optimal way to fetch all the data?
From a SQL perspective and also from a PostgreSQL perspective (some feature of PostgreSQL I am not aware about) what is the most efficient way to retrieve data?
To add, one table is the primary table. The other tables are ‘join’ tables having only two columns containing the primary key from the the primary table and another column which acts as foreign key to another table containing more data which I am not concerned about.
Advertisement
Answer
Joining that many tables in PostgreSQL will take a very long time, not just in the execution phase, but also the planning phase. The query planner will need to consider many different candidate plans while pulling statistics from all the relevant tables. There’s also the from_collapse_limit
and join_collapse_limit
— these limits (documentation) cause the query planner to do its own re-arranging of the query and isn’t always ideal.
You are likely to be better off pulling the data out and use your programming expertise to digest the data before sending it off to ElasticSearch.
Disclosure: I work for EnterpriseDB (EDB)