Speed up removal of duplicates in Oracle with indexing

Question

How to remove duplicate entries from a large Oracle table (200M rows, 20 columns)? The below query from 2014 is slow. It took 2 minutes to delete 1 duplicate entry for one specific combination of columns (i.e. where col1 = 1 and .. col20 = &#8216;Z&#8217;). Any way to speed it up, e.g. with indexing? Answer R…

Accepted Answer

Rather than using an anti-join (and finding the non-matching ROWID and then deleting all the others), you can use the ROW_NUMBER analytic function to directly find the ROWIDs to delete:DELETE FROM sch.table1 tWHERE EXISTS(  SELECT 1  FROM   (    SELECT ROW_NUMBER() OVER (             PARTITION BY col1, col2, col3, col4, ..., col20             ORDER BY rowid           ) AS rn    FROM   sch.table1  ) x  WHERE x.ROWID = t.ROWID  AND   x.rn > 1);or:DELETE FROM sch.table1 tWHERE ROWID IN (  SELECT ROWID  FROM   (    SELECT ROW_NUMBER() OVER (             PARTITION BY col1, col2, col3, col4, ..., col20             ORDER BY rowid           ) AS rn    FROM   sch.table1  )  WHERE rn > 1);fiddle

Advertisement

Answer