I have 2 tables: Activities & Edited they are pretty the same but Edited contains a few additional columns that allow me to process data offline. The record may be at one of the tables or in both.
I need to take a bunch of data that will contain only a unique activityKey. The latest result should be based on editDate. For example:
SELECT activityKey, editDate, deleted FROM activities WHERE projectID = '4' AND deleted = 0 UNION SELECT activityKey, editDate, deleted FROM edited WHERE projectID = '4' ORDER BY editDate DESC LIMIT 20 OFFSET 0
But It only collects the data from both tables. And I get duplications like in the screenshot: WO-981 & WO-924.
Question:
How I can remove these duplicates and use only the newest records based on editDate.
Also, I need to remove the record if deleted = 1 at editDate and deleted = 0 at activities table as with WO-924 it should be removed.
Advertisement
Answer
You can use aggregation for this, if deleted takes on only the values of 0 and 1:
SELECT activityKey, editDate,
       (CASE WHEN MAX(CASE WHEN deleted = 0 THEN editDate END) = MAX(editDate)
             THEN 0 ELSE 1
        END) as deleted
FROM (SELECT activityKey, editDate, deleted
      FROM activities
      WHERE projectID = '4' AND deleted = 0
      UNION ALL
      SELECT activityKey, editDate, deleted
      FROM edited
      WHERE projectID = '4' 
     ) ae
GROUP BY activityKey
ORDER BY MAX(editDate) DESC
LIMIT 20 OFFSET 0;
The more general solution is to use ROW_NUMBER(), but this works for this case.
EDIT:
If you want to skip rows with deleted = 1, then add the HAVING clause:
SELECT activityKey, editDate,
       (CASE WHEN MAX(CASE WHEN deleted = 0 THEN editDate END) = MAX(editDate)
             THEN 0 ELSE 1
        END) as deleted
FROM (SELECT activityKey, editDate, deleted
      FROM activities
      WHERE projectID = '4' AND deleted = 0
      UNION ALL
      SELECT activityKey, editDate, deleted
      FROM edited
      WHERE projectID = '4' 
     ) ae
GROUP BY activityKey
HAVING MAX(editDate) = MAX(CASE WHEN deleted = 1 THEN editDate END)
ORDER BY MAX(editDate) DESC
LIMIT 20 OFFSET 0;
