How to remove rest of the rows with the same ID starting from the first duplicate?

Question

I have the following structure for the table DataTable: every column is of the datatype int, RowID is an identity column and the primary key. LinkID is a foreign key and links to rows of another ...

Accepted Answer

You can use the ROW_NUMBER() window function to identify any rows that come after the original. After that, you can delete any rows with a matching LinkID and an Order greater than or equal to the Order of any row whose row number is greater than one. (I originally used a second CTE to get the MIN Order, but I realized that it wasn't necessary as long as the join matched every Order greater than or equal to any Order where there was a second instance of the same data unit (LinkID, Data, DataSpecifier). Removing the MIN made the query plan quite simple and efficient.)

    -- Number each occurrence of a (LinkID, Data, DataSpecifier) combination by Order
    WITH DataUnitInstances AS (
      SELECT *
        , ROW_NUMBER() OVER
          (PARTITION BY LinkID, [Data], [DataSpecifier] ORDER BY [Order]) DataUnitInstanceId
      FROM DataTable
    )
    -- Delete every row at or after any repeated occurrence within the same LinkID
    DELETE FROM DataTable
    FROM DataTable dt
    INNER JOIN DataUnitInstances dup ON dup.LinkID = dt.LinkID
      AND dup.[Order] <= dt.[Order]
      AND dup.DataUnitInstanceId > 1

Here is the output from your sample data, which matches your desired result:

    +-------+--------+-------+------+---------------+
    | RowID | LinkID | Order | Data | DataSpecifier |
    +-------+--------+-------+------+---------------+
    | 1     | 120    | 1     | 1    | 1             |
    | 2     | 120    | 2     | 1    | 3             |
    | 3     | 120    | 3     | 1    | 10            |
    | 4     | 120    | 4     | 1    | 13            |
    | 7     | 371    | 1     | 6    | 2             |
    | 8     | 371    | 2     | 3    | 5             |
    | 9     | 371    | 3     | 8    | 1             |
    | 10    | 371    | 4     | 10   | 1             |
    | 11    | 371    | 5     | 7    | 2             |
    | 12    | 371    | 6     | 3    | 3             |
    +-------+--------+-------+------+---------------+
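If you want to try the idea without a SQL Server instance, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: SQLite has no `DELETE ... FROM ... JOIN`, so the join is moved into a `WHERE RowID IN (...)` subquery; rows 1-4 and 7-12 are copied from the result table above, while rows 5 and 6 are invented, because the question's original sample rows are not shown here; and the helper name `remaining_row_ids` is hypothetical. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

# Rows 1-4 and 7-12 come from the result table above. Rows 5 and 6 are
# HYPOTHETICAL: row 5 repeats the (Data, DataSpecifier) pair of row 2,
# and row 6 is an arbitrary row that follows the duplicate.
ROWS = [
    (1, 120, 1, 1, 1),
    (2, 120, 2, 1, 3),
    (3, 120, 3, 1, 10),
    (4, 120, 4, 1, 13),
    (5, 120, 5, 1, 3),    # hypothetical duplicate of row 2
    (6, 120, 6, 1, 20),   # hypothetical row after the duplicate
    (7, 371, 1, 6, 2),
    (8, 371, 2, 3, 5),
    (9, 371, 3, 8, 1),
    (10, 371, 4, 10, 1),
    (11, 371, 5, 7, 2),
    (12, 371, 6, 3, 3),
]

# SQLite adaptation of the T-SQL statement: same CTE, but the join lives
# inside an IN subquery because SQLite cannot join in a DELETE.
DELETE_SQL = """
WITH DataUnitInstances AS (
    SELECT LinkID, "Order",
           ROW_NUMBER() OVER (
               PARTITION BY LinkID, Data, DataSpecifier
               ORDER BY "Order"
           ) AS DataUnitInstanceId
    FROM DataTable
)
DELETE FROM DataTable
WHERE RowID IN (
    SELECT dt.RowID
    FROM DataTable AS dt
    INNER JOIN DataUnitInstances AS dup
        ON dup.LinkID = dt.LinkID
       AND dup."Order" <= dt."Order"
       AND dup.DataUnitInstanceId > 1
)
"""


def remaining_row_ids():
    """Load the sample rows, run the delete, and return the surviving RowIDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE DataTable ('
        'RowID INTEGER PRIMARY KEY, LinkID INTEGER, "Order" INTEGER, '
        'Data INTEGER, DataSpecifier INTEGER)'
    )
    conn.executemany("INSERT INTO DataTable VALUES (?, ?, ?, ?, ?)", ROWS)
    conn.execute(DELETE_SQL)
    ids = [row[0] for row in conn.execute("SELECT RowID FROM DataTable ORDER BY RowID")]
    conn.close()
    return ids


if __name__ == "__main__":
    print(remaining_row_ids())
```

With this made-up input, only row 5 gets a row number greater than one, so rows 5 and 6 (LinkID 120, Order 5 and above) are removed and every row for LinkID 371 is kept.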


Answer