Skip to content
Advertisement

How to remove duplicates out of the UNION but ignore one column

Consider the following table of data:

FirstName LastName Department
Steve Colton Accounting
Stacy Beckham Finance
Mary Messi Finance
Steve Colton Finance
Michael Gretsky Finance

As you can see Steve Colton is in both Accounting and the Finance departments.
I want a query that should return Steve just once.

I can do the following which works but seems like more code than needed:

insert #FinalData(FirstName, LastName, Department)
select * from MyTable where Department = 'Accounting'

insert #FinalData(FirstName, LastName, Department)
select * from from MyTable mt1 where mt1.Department = 'Finance'
  and not exists (
      select 1 from #FinalData fd 
      where fd.FirstName = mt1.FirstName and fd.LastName = mt1.LastName
  )

I am looking for a more concise way to do this. I tried UNION but there doesn’t seem to be a way to exclude a column for purposes of deduplication.

Is there another way?

Advertisement

Answer

You can use row_number(). If you want one row per first name (what your question implies), then:

select t.*
from (select t.*,
             row_number() over (partition by firstname order by department) as seqnum
      from MyTable 
     ) t
where seqnum = 1;

This will choose “Accounting” over “Finance” for duplicates.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement