Skip to content
Advertisement

MySQL: select row only where closest to date has column value

I want to return all rows that were public in May (2019-05), so if a row was turned to draft (and not back to public) at any point before the end of May, I don’t want it. For example:

id | post_id | status | date
-------------------------
 1 | 1       | draft  | 2019-03-25
 2 | 1       | public | 2019-04-02
 3 | 1       | draft  | 2019-05-25
 4 | 2       | draft  | 2019-03-10
 5 | 2       | public | 2019-04-01
 6 | 2       | draft  | 2019-06-01

The desired result for the above would return post_id 2 because its last status change prior to the end of May was to public.

post_id 1 was put back in draft before the end of May, so it would not be included.

I’m not sure how to use the correct join or sub-queries to do this as efficiently as possible.

Advertisement

Answer

You seem to want the status as of 2019-05-31. A correlated subquery seems like the simplest solution:

select t.*
from t
where t.date = (select max(t2.date)
                from t t2
                where t2.post_id = t.post_id and
                      t2.date <= '2019-05-31'
               );

To get the ones that are public, just add a WHERE condition:

select t.*
from t
where t.date = (select max(t2.date)
                from t t2
                where t2.post_id = t.post_id and
                      t2.date <= '2019-05-31'
               ) and
      t.status = 'public';

For performance, you want an index on (post_id, date).

You can also phrase this using a JOIN:

select t.*
from t join
     (select t2.post_id, max(t2.date) as max_date
      from t t2
      where t2.date <= '2019-05-31'
      group by t2.post_id
     ) t2
     on t2.max_date = t.date
where t.status = 'public';

I would expect the correlated subquery to have better performance with the right indexes. However, sometimes MySQL surprises me.

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement