SQL DELETE – Only keep last n records in group (delete old records in group)

Question

[Edit]: this answer helped:

    DELETE FROM Table WHERE 'RANK' > 1

I have a table (tb_status) which keeps track of some entity's status history. Here is an example:

status_id	ref_entity_aa	ref_entity_ab	status
1	"a"	1	OK
1	"a"	2	OK
1	"b"	1	OK
1	"b"	2	OK
2	"a"	1	OK
2	"a"	2	OK
2	"b"

Accepted Answer

I use the RANK() window function to order the records within each group from newest to oldest (status_id descending). A subquery then isolates the primary keys of the records to keep (rank <= N), and the DELETE removes every row whose primary key is not in that set.

    IF OBJECT_ID('TEMPDB..#TEMP') IS NOT NULL
        DROP TABLE #TEMP

    CREATE TABLE #TEMP (
        PrimKey             INT
        ,status_id          INT
        ,ref_entity_aa      NVARCHAR(10)
        ,ref_entity_ab      INT
        ,[status]           NVARCHAR(10)
    );

    INSERT INTO #TEMP (PrimKey,status_id,ref_entity_aa,ref_entity_ab,[status])
    VALUES (1000,1,'a',1,'OK')
        ,(1001,1,'a',2,'OK')
        ,(1002,1,'b',1,'OK')
        ,(1003,1,'b',2,'OK')
        ,(1004,2,'a',1,'OK')
        ,(1005,2,'a',2,'OK')
        ,(1006,2,'b',1,'OK')
        ,(1007,2,'b',2,'ERROR')

    -- Before delete
    SELECT * FROM #TEMP

    -- Delete every row that is not among the newest N rows of its group
    DELETE #TEMP WHERE PrimKey NOT IN (
        SELECT PrimKey
        FROM (
            SELECT PrimKey
                ,status_id
                ,RANK() OVER(PARTITION BY ref_entity_aa,ref_entity_ab,[status] ORDER BY status_id DESC) [rank]
                ,ref_entity_aa
                ,ref_entity_ab
                ,[status]
            FROM #TEMP
        ) A
        WHERE [rank] <= 1 -- N: number of records to keep per group
    )

    -- After delete
    SELECT * FROM #TEMP

Let me know if this works!

Or use this solution if you do not have a primary key. You can build one by concatenating your composite key columns and then apply the same idea as above. Just keep in mind that numeric columns need to be converted to NVARCHAR when they are concatenated with NVARCHAR columns.
    -- Before delete: show each row with its concatenated key
    -- NVARCHAR(11) holds any INT; a width of 1 only fits single-digit values
    SELECT
        CAST(T.[status_id] AS NVARCHAR(11))+T.ref_entity_aa+CAST(T.ref_entity_ab AS NVARCHAR(11))+T.[status]
        ,*
    FROM #TEMP T

    -- Delete every row whose concatenated key is not among the newest N of its group
    DELETE #TEMP
    WHERE CAST([status_id] AS NVARCHAR(11))+ref_entity_aa+CAST(ref_entity_ab AS NVARCHAR(11))+[status] NOT IN (
        SELECT CAST([status_id] AS NVARCHAR(11))+ref_entity_aa+CAST(ref_entity_ab AS NVARCHAR(11))+[status]
        FROM (
            SELECT status_id
                ,RANK() OVER(PARTITION BY ref_entity_aa,ref_entity_ab,[status] ORDER BY status_id DESC) [rank]
                ,ref_entity_aa
                ,ref_entity_ab
                ,[status]
            FROM #TEMP
        ) A
        WHERE [rank] <= 1 -- N: number of records to keep per group
    )

    -- After delete: remaining rows with their concatenated key
    SELECT
        CAST(T.[status_id] AS NVARCHAR(11))+T.ref_entity_aa+CAST(T.ref_entity_ab AS NVARCHAR(11))+T.[status]
        ,T.ref_entity_aa
        ,T.ref_entity_ab
        ,T.[status]
    FROM #TEMP T

status_id	ref_entity_aa	ref_entity_ab	status
1	"a"	1	OK
1	"a"	2	OK
1	"b"	1	OK
1	"b"	2	OK
2	"a"	1	OK
2	"a"	2	OK
2	"b"	1	OK
2	"b"	2	ERROR
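
For comparison, here is a minimal sketch of the same "keep the newest N per group" idea written as a DELETE through a common table expression, which needs neither a primary key nor a concatenated key. It is an illustration under assumptions, not part of the original answer: the table name #DEMO and the variable @N are hypothetical, the data and the PARTITION BY columns mirror the example above, and it swaps RANK() for ROW_NUMBER() so that ties on status_id cannot keep more than N rows per group.

```sql
-- Hypothetical demo table (#DEMO); columns and data mirror the example above
IF OBJECT_ID('TEMPDB..#DEMO') IS NOT NULL
    DROP TABLE #DEMO;

CREATE TABLE #DEMO (
    status_id       INT
    ,ref_entity_aa  NVARCHAR(10)
    ,ref_entity_ab  INT
    ,[status]       NVARCHAR(10)
);

INSERT INTO #DEMO (status_id,ref_entity_aa,ref_entity_ab,[status])
VALUES (1,'a',1,'OK')
    ,(1,'a',2,'OK')
    ,(1,'b',1,'OK')
    ,(1,'b',2,'OK')
    ,(2,'a',1,'OK')
    ,(2,'a',2,'OK')
    ,(2,'b',1,'OK')
    ,(2,'b',2,'ERROR');

-- Assumed parameter: how many of the newest rows to keep in each group
DECLARE @N INT = 1;

-- Number the rows of each group from newest (1) to oldest,
-- then delete through the CTE everything numbered beyond @N
WITH Ranked AS (
    SELECT status_id
        ,ref_entity_aa
        ,ref_entity_ab
        ,[status]
        ,ROW_NUMBER() OVER(PARTITION BY ref_entity_aa,ref_entity_ab,[status] ORDER BY status_id DESC) AS rn
    FROM #DEMO
)
DELETE FROM Ranked
WHERE rn > @N;

-- Remaining rows
SELECT * FROM #DEMO;
```

With @N = 1 and this sample data, five rows should remain: the status_id 2 rows for ("a",1), ("a",2) and ("b",1), plus both ("b",2) rows, because OK and ERROR fall into different groups when [status] is part of the PARTITION BY. Drop [status] from the PARTITION BY if a group should be defined by the two ref_entity columns only.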

Answer