Given the following SQL query
SELECT *
FROM geotypes
WHERE geotype_id IN (SELECT geotype_id
FROM geotype_mixed
WHERE test_name = $1)
ORDER BY latest_date DESC
LIMIT 100
Recently this query has been taking longer to execute and as I am new to SQL I am wondering if there is anyway to optimize it further? I feel I may not have the most optimal query.
Advertisement
Answer
In general subqueries does not perform well. Also, “IN” operator also not good when looking for performance. So, I would try changing your query as below:
SELECT * FROM geotypes gt
join geotype_mixed gtm on gt.geotype_id = gtm.geotype_id
WHERE gtm.test_name = $1
ORDER BY latest_date DESC LIMIT 100
since test_name is used in filtering, creating index on text_name will help improving the performance. Also, index on latest_date (but usually indexes on datetime column won’t make big difference but okay to consider. if its just date column then yes, you will see some improvement). I am assuming geotype_id would be primary key and/or foreign keys in respective tables. If that is not the case, you will need indexes on them as well.