Given the following SQL query
SELECT * 
FROM geotypes 
WHERE geotype_id IN (SELECT geotype_id 
                     FROM geotype_mixed 
                     WHERE test_name = $1) 
ORDER BY latest_date DESC 
LIMIT 100
Recently this query has been taking longer to execute and as I am new to SQL I am wondering if there is anyway to optimize it further? I feel I may not have the most optimal query.
Advertisement
Answer
In general subqueries does not perform well. Also, “IN” operator also not good when looking for performance. So, I would try changing your query as below:
SELECT * FROM geotypes gt join geotype_mixed gtm on gt.geotype_id = gtm.geotype_id WHERE gtm.test_name = $1 ORDER BY latest_date DESC LIMIT 100
since test_name is used in filtering, creating index on text_name will help improving the performance. Also, index on latest_date (but usually indexes on datetime column won’t make big difference but okay to consider. if its just date column then yes, you will see some improvement). I am assuming geotype_id would be primary key and/or foreign keys in respective tables. If that is not the case, you will need indexes on them as well.