Consider the following table of data: FirstName LastName Department Steve Colton Accounting Stacy Beckham Finance Mary Messi Finance Steve Colton Finance Michael Gretsky Finance As you can see Steve Colton is in both Accounting and the Finance departments. I want a query that should return Steve just once. I can do the following which works but seems like more code

How to remove duplicates out of the UNION but ignore one column

Consider the following table of data:

FirstName	LastName	Department
Steve	Colton	Accounting
Stacy	Beckham	Finance
Mary	Messi	Finance
Steve	Colton	Finance
Michael	Gretsky	Finance

As you can see Steve Colton is in both Accounting and the Finance departments.
I want a query that should return Steve just once.

I can do the following which works but seems like more code than needed:

insert #FinalData(FirstName, LastName, Department)
select * from MyTable where Department = 'Accounting'

insert #FinalData(FirstName, LastName, Department)
select * from from MyTable mt1 where mt1.Department = 'Finance'
  and not exists (
      select 1 from #FinalData fd 
      where fd.FirstName = mt1.FirstName and fd.LastName = mt1.LastName
  )

​x
 
insert #FinalData(FirstName, LastName, Department)select * from MyTable where Department = 'Accounting'​insert #FinalData(FirstName, LastName, Department)select * from from MyTable mt1 where mt1.Department = 'Finance'  and not exists (      select 1 from #FinalData fd       where fd.FirstName = mt1.FirstName and fd.LastName = mt1.LastName  )​

I am looking for a more concise way to do this. I tried UNION but there doesn’t seem to be a way to exclude a column for purposes of deduplication.

Is there another way?

Answer

You can use row_number(). If you want one row per first name (what your question implies), then:

select t.*
from (select t.*,
             row_number() over (partition by firstname order by department) as seqnum
      from MyTable 
     ) t
where seqnum = 1;

 
select t.*from (select t.*,             row_number() over (partition by firstname order by department) as seqnum      from MyTable      ) twhere seqnum = 1;​

This will choose “Accounting” over “Finance” for duplicates.

Advertisement

Answer