I have this query that will run on very large data sets and it is too slow.
SELECT * FROM tableA WHERE columnA = (SELECT MAX(columnA) -- select most recent date from entries we care about FROM tableA WHERE columnB = '1234' ) AND columnC in (1,2,3) -- pull a subset out of those entries, this set here can be a thousand (ish) large.
table A looks something like this
There are about 1000 distinct entries in
columnB and many orders of magnitude more in the table. Is there a better way to structure the query? Or columns I can add to the table that will make it faster?
I suspect that it will be the last line which is taking the most time because the list must be parsed.
AND columnC in (1,2,3) -- pull a subset out of those entries, this set here can be a thousand (ish) large.
It would be better to put these values in a table with an index (PRIMARY KEY) so that the query only consults the index.
Join tableX X On x.id = columnC;
We can also create indexes on columns A and B.