Skip to content
Advertisement

SQL Find duplicate sets

given this Schema:

 table tblSET 
    SetID int  PK
    SetName nvarchar(100) 


 Table tblSetItem 
    SetID int  PK
    ItemID int PK

tblSetItem.SetID is a FK into the tblSet Table.

Some data:

tblSet

 SetID    SetName
     1    Red
     2    Blue
     3    Maroon
     4    Yellow
     5    Sky

tblSetItem

 SetID    ItemID
     1       100
     1       101
     2       100
     2       108
     2       109
     3       100
     3       101
     4       101
     4       108
     4       109
     4       110
     5       100
     5       108
     5       109

I’d like a way to identify which sets contain the same items. In the example above Red and Maroon contain the same items (100,101) and Blue and Sky contain the same values (100,108,109)

Is there a sql query which would provide this answer?

Advertisement

Answer

You can use xml support to create a comma separated list (cf this answer: https://stackoverflow.com/a/1785923/215752). For this case I don’t care about the form so I leave the starting comma in.

Note, I couldn’t test this right now so I might have a typo…

select * from
(
  select SetID, setitemuniquestring,
         count(*) OVER (PARTITION BY setitemuniquestring) as cnt
  from
  (
    select S.SetID,
      (select ',' + I.ItemID
       from tblSetItem I
       where S.SetID = I.SetID
       order by u.ItemID ASC
       for xml path('')
      ) as setitemuniquestring
    from tblSet S
    group by S.SetID
  ) sub
) sub2
where cnt > 2
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement