Skip to content
Advertisement

How to select the latest date for each group by number?

I’ve been stuck on this question for a while, and I was wondering if the community would be able to direct me in the right direction?

I have some tag IDs that needs to be grouped, with exceptions (column: deleted) that need to be retained in the results. After which, for each grouped tag ID, I need to select the one with the latest date. How can I do this? An example below:

ID  |  TAG_ID  |  DATE     |  DELETED 
1   |  300     |  05/01/20 |  null
2   |  300     |  03/01/20 |  04/01/20
3   |  400     |  06/01/20 |  null
4   |  400     |  05/01/20 |  null
5   |  400     |  04/01/20 |  null
6   |  500     |  03/01/20 |  null
7   |  500     |  02/01/20 |  null

I am trying to reach this outcome:

ID  |  TAG_ID  |  DATE     |  DELETED 
1   |  300     |  05/01/20 |  null
2   |  300     |  03/01/20 |  04/01/20
3   |  400     |  06/01/20 |  null
6   |  500     |  03/01/20 |  null

So, firstly if there is a date in the “DELETED” column, I would like the row to be present. Secondly, for each unique tag ID, I would like the row with the latest “DATE” to be present.

Hopefully this question is clear. Would appreciate your feedback and help! A big thanks in advance.

Advertisement

Answer

Your results seem to be something like this:

select t.*
from (select t.*,
             row_number() over (partition by tag_id, deleted order by date desc) as seqnum
      from t
     ) t
where seqnum = 1 or deleted is not null;

This takes one row where deleted is null — the most recent row. It also keeps each row where deleted is not null.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement