I want to return all rows that were public in May (2019-05), so if a row was turned to draft (and not back to public) at any point before the end of May, I don’t want it. For example:
id | post_id | status | date ------------------------- 1 | 1 | draft | 2019-03-25 2 | 1 | public | 2019-04-02 3 | 1 | draft | 2019-05-25 4 | 2 | draft | 2019-03-10 5 | 2 | public | 2019-04-01 6 | 2 | draft | 2019-06-01
The desired result for the above would return post_id 2
because its last status change prior to the end of May was to public
.
post_id 1
was put back in draft before the end of May, so it would not be included.
I’m not sure how to use the correct join or sub-queries to do this as efficiently as possible.
Advertisement
Answer
You seem to want the status as of 2019-05-31. A correlated subquery seems like the simplest solution:
select t.* from t where t.date = (select max(t2.date) from t t2 where t2.post_id = t.post_id and t2.date <= '2019-05-31' );
To get the ones that are public, just add a WHERE
condition:
select t.* from t where t.date = (select max(t2.date) from t t2 where t2.post_id = t.post_id and t2.date <= '2019-05-31' ) and t.status = 'public';
For performance, you want an index on (post_id, date)
.
You can also phrase this using a JOIN
:
select t.* from t join (select t2.post_id, max(t2.date) as max_date from t t2 where t2.date <= '2019-05-31' group by t2.post_id ) t2 on t2.max_date = t.date where t.status = 'public';
I would expect the correlated subquery to have better performance with the right indexes. However, sometimes MySQL surprises me.