Let’s say I have the following structure of tables;
----------------------------------------------------------- - table1 - - id(PK) | date_created | column1 | t2_id | t3_id | t4_id - - 1 ... sample 1 1 1 - - 2 ... sample 2 2 2 - - 3 ... sample 3 3 3 - ----------------------------------------------------------- ------------------- -------------------- -------------------- - table2 - - table3 - - table4 - - id(PK) | column1 - - id(PK) | column1 - - id(PK) | column1 - - 1 sample - - 1 sample - - 1 sample - - 2 sample - - 2 sample - - 2 sample - - 3 sample - - 3 sample - - 3 sample - -------------------- -------------------- --------------------
and I want a result of;
------------------------------------------------------------------------------ -date_created | column1 | table2.column1 | table3.column1 | table4.column1 - - ... sample sample sample sample - - ... sample sample sample sample - - ... sample sample sample sample - ------------------------------------------------------------------------------
So I used the following query;
SELECT table1.date_created, table1.column1, table2.column1, table3.column1, table4.column1 FROM table1, table2, table3, table4 WHERE table2.id = table1.t2_id AND table3.id = table1.t3_id AND table4.id = table1.t4_id ORDER BY table1.date_created DESC
I still have other tables that are connected to table1. The problem is the result is too slow even though there are only three results. I’m not that good at databases maybe I’m doing it wrong.
Advertisement
Answer
First of all, always remember that using JOIN
on a large table will make your query very slow. Your database structure is already considered large since you have 16 tables that are related to the table1
(That makes them 17) plus the number of data.
You only have four rows in every table at the moment, it is worth it if you mention the span time of the result of your query so we have a baseline on how slow is it and we don’t just assume.
Anyway, dividing those tables into multiple to avoid duplicate entries is a good normalization. However, you must think again about the table1
. This is the core of the relation between your tables so you must know how to handle this properly.
The solution is to create an index of every table, see Gordon Linoff answer.
Then use LEFT JOIN
so you get every row even the corresponding table is empty or the ID didn’t match.
SELECT t1.date_created, t1.column1, t2.column1, t3.column1, t4.column1 FROM table1 t1 LEFT JOIN table2 t2 ON t2.id = t1.t2_id LEFT JOIN table3 t3 ON t3.id = t1.t3_id LEFT JOIN table4 t4 ON t4.id = t1.t4_id ORDER BY t1.date_created DESC;
Then finally, pick only the column from the specific table that you really need. This is how you can enhance your query. Don’t just query it all, be smart on which data you just need.