I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the amount of dupes, but also return the ids that are associated with each organization.
A statement like:
SELECT orgName, COUNT(*) AS dupes FROM organizations GROUP BY orgName HAVING (COUNT(*) > 1)
Will return something like
orgName | dupes ABC Corp | 7 Foo Federation | 5 Widget Company | 2
But I’d also like to grab the IDs of them. Is there any way to do this? Maybe like a
orgName | dupeCount | id ABC Corp | 1 | 34 ABC Corp | 2 | 5 ... Widget Company | 1 | 10 Widget Company | 2 | 2
The reason being that there is also a separate table of users that link to these organizations, and I would like to unify them (therefore remove dupes so the users link to the same organization instead of dupe orgs). But I would like part manually so I don’t screw anything up, but I would still need a statement returning the IDs of all the dupe orgs so I can go through the list of users.
Advertisement
Answer
select o.orgName, oc.dupeCount, o.id from organizations o inner join ( SELECT orgName, COUNT(*) AS dupeCount FROM organizations GROUP BY orgName HAVING COUNT(*) > 1 ) oc on o.orgName = oc.orgName