Skip to content
Advertisement

SQL DELETE – Only keep last n records in group (delete old records in group)

[Edit] : this answer helped DELETE FROM Table WHERE ‘RANK’ > 1

I have a table which keeps track of some entity’s status history (tb_status)

DDL

here is an example

status_id ref_entity_aa ref_entity_ab status
1 “a” 1 OK
1 “a” 2 OK
1 “b” 1 OK
1 “b” 2 OK
2 “a” 1 OK
2 “a” 2 OK
2 “b” 1 OK
2 “b” 2 ERROR

I want to keep the last n records per primary key per status

so if n was 1 after the delete the table should look like this

status_id ref_entity_aa ref_entity_ab status
1 “b” 2 OK
2 “a” 1 OK
2 “a” 2 OK
2 “b” 1 OK
2 “b” 2 ERROR

This group by query returns the number of rows for each group

In the example this would be

ref_entity_aa ref_entity_ab status count
“a” 1 OK 2
“a” 2 OK 2
“b” 1 OK 2
“b” 2 OK 1
“b” 2 ERROR 1

After the delete query executes the group by query should return

ref_entity_aa ref_entity_ab status count
“a” 1 OK 1
“a” 2 OK 1
“b” 1 OK 1
“b” 2 OK 1
“b” 2 ERROR 1

The goal is to delete old records (i.e. records that have lowest value of status_id ) so that this group by query returns a count <= n

I was able to come up with this query

Advertisement

Answer

I use a rank() window function to determine which record is older (ordered descending). Then I use a subquery to isolate the primary key that identifies the record to be deleted. This array of primary keys can be used in a WHERE clause to delete the records based on the primary key defined in the subquery.

Let me know if this works!

OUTPUT BEFORE AND AFTER DELETE:

enter image description here

Or use this solution if you do not have a primary key. You can calculate one with your composite key columns and do the same idea as above. Just keep in mind the numerical datatypes will need to be converted to nvarchar() when there are NVARCHAR columns involved.

OUTPUT WITH CONCATENATED KEY:(BEFORE AND AFTER) enter image description here

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement