How do I Delete duplicated rows in one Table and update References in another table to the remaining row? The duplication only occurs in the name. The Id Columns are Identity columns.
Example:
Assume we have two tables Doubles
and Data
.
x
Doubles table (
Id int,
Name varchar(50)
)
Data Table (
Id int,
DoublesId int
)
Now I Have Two entries in the Doubls table:
Id Name
1 Foo
2 Foo
And two entries in the Data Table:
ID DoublesId
1 1
2 2
At the end there should be only one entry in the Doubles Table:
Id Name
1 Foo
And two entries in the Data Table:
Id DoublesId
1 1
2 1
In the doubles Table there can be any number of duplicated rows per name (up to 30) and also regular ‘single’ rows.
Advertisement
Answer
I’ve not run this, but hopefully it should be correct, and close enough to the final soln to get you there. Let me know any mistakes if you like and I’ll update the answer.
--updates the data table to the min ids for each name
update Data
set id = final_id
from
Data
join
Doubles
on Doubles.id = Data.id
join
(
select
name
min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
--deletes redundant ids from the Doubles table
delete
from Doubles
where id not in
(
select
min(id) as final_id
from Doubles
group by name
)