I have 2 tables: Activities
& Edited
they are pretty the same but Edited
contains a few additional columns that allow me to process data offline. The record may be at one of the tables or in both.
I need to take a bunch of data that will contain only a unique activityKey
. The latest result should be based on editDate
. For example:
SELECT activityKey, editDate, deleted FROM activities WHERE projectID = '4' AND deleted = 0 UNION SELECT activityKey, editDate, deleted FROM edited WHERE projectID = '4' ORDER BY editDate DESC LIMIT 20 OFFSET 0
But It only collects the data from both tables. And I get duplications like in the screenshot: WO-981
& WO-924
.
Question:
How I can remove these duplicates and use only the newest records based on editDate
.
Also, I need to remove the record if deleted = 1
at editDate
and deleted = 0
at activities
table as with WO-924
it should be removed.
Advertisement
Answer
You can use aggregation for this, if deleted
takes on only the values of 0 and 1:
SELECT activityKey, editDate, (CASE WHEN MAX(CASE WHEN deleted = 0 THEN editDate END) = MAX(editDate) THEN 0 ELSE 1 END) as deleted FROM (SELECT activityKey, editDate, deleted FROM activities WHERE projectID = '4' AND deleted = 0 UNION ALL SELECT activityKey, editDate, deleted FROM edited WHERE projectID = '4' ) ae GROUP BY activityKey ORDER BY MAX(editDate) DESC LIMIT 20 OFFSET 0;
The more general solution is to use ROW_NUMBER()
, but this works for this case.
EDIT:
If you want to skip rows with deleted
= 1, then add the HAVING
clause:
SELECT activityKey, editDate, (CASE WHEN MAX(CASE WHEN deleted = 0 THEN editDate END) = MAX(editDate) THEN 0 ELSE 1 END) as deleted FROM (SELECT activityKey, editDate, deleted FROM activities WHERE projectID = '4' AND deleted = 0 UNION ALL SELECT activityKey, editDate, deleted FROM edited WHERE projectID = '4' ) ae GROUP BY activityKey HAVING MAX(editDate) = MAX(CASE WHEN deleted = 1 THEN editDate END) ORDER BY MAX(editDate) DESC LIMIT 20 OFFSET 0;