Deleting duplicate rows with primary keys that are connected to other tables

Question

A process was creating duplicate rows in a table that was not supposed to have any. There are several great answers online about deleting duplicate rows. But what if those duplicates, each with its own ID primary key, have data in other tables tied to them? Is there a way to delete all duplicates in the first table and migrate all

Accepted Answer

You cannot do this automatically, but you can do it with a couple of queries. First, point all the foreign keys at the correct id, which is presumably the smallest one in each group of duplicates:

    with ids as (
        select t1.*, min(id) over (partition by Model, ItemType, Color) as min_id
        from table1 t1
    )
    update t2
        set t2.otherid = ids.min_id
        from table2 t2 join
             ids
             on t2.otherid = ids.id
        where ids.id <> ids.min_id;

Then delete the ids that are either duplicated or not referenced in table2 (depending on which you actually want). This version deletes the duplicates:

    with ids as (
        select t1.*, min(id) over (partition by Model, ItemType, Color) as min_id
        from table1 t1
    )
    delete from ids
        where id <> min_id;

Note: If the database has concurrent users, you might want to put it in single user mode for this operation, or lock the tables, so they are not modified between these two operations.
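One way to make the two steps safer to run is to wrap them in a single transaction and check the result before committing. The following is only a sketch, assuming SQL Server (the `update ... from` and delete-through-CTE forms used in the answer are T-SQL syntax) and the answer's placeholder names: `table1(id, Model, ItemType, Color)` and `table2.otherid` referencing `table1.id`. The two check queries are additions for illustration, not part of the original answer, and the transaction alone does not replace the locking advice in the note above.

```sql
-- Sketch only: table and column names are the answer's placeholders.
BEGIN TRANSACTION;

-- Step 1: repoint child rows at the surviving (smallest) id of each group.
WITH ids AS (
    SELECT t1.id,
           MIN(t1.id) OVER (PARTITION BY t1.Model, t1.ItemType, t1.Color) AS min_id
    FROM table1 t1
)
UPDATE t2
SET t2.otherid = ids.min_id
FROM table2 t2
JOIN ids ON t2.otherid = ids.id
WHERE ids.id <> ids.min_id;

-- Step 2: delete every row that is not the surviving row of its group.
WITH ids AS (
    SELECT t1.id,
           MIN(t1.id) OVER (PARTITION BY t1.Model, t1.ItemType, t1.Color) AS min_id
    FROM table1 t1
)
DELETE FROM ids
WHERE id <> min_id;

-- Check A: lists any group that still has more than one row.
SELECT Model, ItemType, Color, COUNT(*) AS copies
FROM table1
GROUP BY Model, ItemType, Color
HAVING COUNT(*) > 1;

-- Check B: lists any child row whose otherid has no matching row in table1.
SELECT t2.otherid
FROM table2 t2
LEFT JOIN table1 t1 ON t1.id = t2.otherid
WHERE t1.id IS NULL
  AND t2.otherid IS NOT NULL;

-- Inspect the two result sets, then finish with exactly one of:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```

If Check A returns rows, duplicates remain. If Check B returns rows, some child rows are orphaned, either by this operation or from before it. In either case, rolling back leaves both tables as they were.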

Answer