I have a data set where there are individuals appearning multiple times with the same name and SSN but different personids. I want to remove those people and only keep those that appear one time.
declare @t table(personid int, first_name varchar(50), last_name varchar(50), ssn int) insert into @t values(1, 'Roman', 'Smith', 12), (2, 'John', 'Thomas', 15), (3, 'John', 'Thomas', 15)
In this instance I only want to return Roman Smith and eliminate John Thomas altogether.
I’ve been playing around with Window Functions and I’ve only managed to get this far:
select * from ( select person_id, first_name, last_name, middle_initial, ssn, row_number() over(partition by ssn, first_name, last_name, middle_initial order by person_id) as rn from t_person where person_id in (1, 2, 3) )a where rn = 1
Advertisement
Answer
In this scenario I like to use NOT EXISTS. Easy way to perform elimination.
SELECT * FROM @t AS T WHERE NOT EXISTS (SELECT 1 FROM @t WHERE ssn = T.ssn GROUP BY ssn HAVING COUNT(*) > 1)