Skip to content
Advertisement

Subquery is slow

I have a mysql table with about 10 Million rows. For each row, I have an id column and date column. The id column is not unique, and for one id there are multiple rows with different values for date, typically 3-6 dates for each id. I want to select the rows with the latest date for their id.

My query:

SELECT   id,
         date
FROM     tab a
WHERE    a.date = (SELECT MAX(date)
                   FROM   tab b
                   WHERE  a.id=b.id)

is very slow and takes minutes to complete. It feels this could be done much faster. What are the best practices here?

Advertisement

Answer

Why not just ?

SELECT   id,
         MAX(date) date
FROM     tab
GROUP BY id;
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement