Skip to content
Advertisement

Query to update duplicates

I have the following query that returns duplicates that i am running on sql server. So I need to write a query that will update second instance of the email address based on the select query that retrieves dulicates to for example imports@rohnis.com to imports@rohnis.com.duplicate. If it is info@eps.ws then info@eps.ws.duplicate. So basically postfix the email address with the word duplicate. It could be any email address

Query to search duplicates

SELECT ta.Id
      ,ta.Email
      ,ta.ClientCompanyId

FROM [IdentityDB_CSR].[dbo].[User] ta
WHERE (SELECT COUNT(*)
       FROM [IdentityDB_CSR].[dbo].[User] ta2
       WHERE ta.Email=ta2.Email
       AND ta.ClientCompanyId=ta2.ClientCompanyId)>1

Output of the query

enter image description here

Query to update

update [IdentityDB_CSR].[dbo].[User]
  set Email = 'info@eps.ws.duplicate'
  where id = 87183

Advertisement

Answer

You could use an updatable cte:

with cte as (
    select
        Email,
        row_number() over(partition by Email, ClientCompanyId order by id desc) rn
    from [IdentityDB_CSR].[dbo].[User]
)
update cte 
set Email = Email + '.duplicate'
where rn > 1

This identifies duplicates as record that share the same Email and ClientCompanyId. The record that has the greatest id is left untouched, while for others we add '.duplicate at the end of the Email.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement