Large SQL table: delete all duplicates

I have a database table with 33,265,902 (roughly 33 million) records in it.

Much of this is made up of duplicates.

It is causing me issues, as any website that queries this table puts noticeable load on the server.

What makes a record unique is the combination of two fields: email_address and job_id.

My question is: what is the best way to delete all duplicates, leaving one good record per (email_address, job_id) combination?

What is the most efficient way to do this without killing the server?
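
For scale, a grouping query shows how widespread the duplication is (using t as a stand-in for the table name, as in the answer below):

-- list each duplicated (email_address, job_id) pair and how many copies it has
select email_address, job_id, count(*) as copies
from t
group by email_address, job_id
having count(*) > 1
order by copies desc;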


Answer

The most efficient way is probably to keep one row per (email_address, job_id) pair in a new table, then truncate the old table and re-insert the values:

select t.*  -- or list all columns except seqnum
into temp_table
from (select t.*,
             row_number() over (partition by email_address, job_id
                                order by job_id) as seqnum  -- tie-break order is arbitrary
      from t
     ) t
where seqnum = 1;

-- required if you kept seqnum above (t has no seqnum column)
alter table temp_table drop column seqnum;

-- back this up first!
truncate table t;

insert into t
    select * from temp_table;
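
Before the truncate, it is worth confirming that temp_table holds exactly one row per pair; a quick sanity check using the same names:

-- these two counts should match
select count(*) from temp_table;

select count(*)
from (select distinct email_address, job_id from t) d;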

Alternatively, you could drop the original table and rename temp_table to the original name. If you do so, remember to re-create the constraints, indexes, triggers, partitions, and whatever else is defined on the original table.
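
In SQL Server (which the select ... into syntax above suggests), that swap might look like the following sketch; the truncate and re-insert steps are then unnecessary:

-- swap the tables instead of truncating and re-inserting
drop table t;
exec sp_rename 'temp_table', 't';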

This will not be super fast (you can see how long the temporary table takes to create). However, it avoids the locking and logging required to delete tens of millions of rows from the original table. As a bonus, the new table should be less fragmented than if you did the deletes in place.
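
For comparison, the in-place alternative being avoided is a windowed delete like the sketch below (again assuming SQL Server syntax), which would lock and log every one of the millions of deleted rows:

with numbered as (
    select row_number() over (partition by email_address, job_id
                              order by job_id) as seqnum
    from t
)
delete from numbered
where seqnum > 1;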
