Skip to content
Advertisement

Using distinct with stuff/for xml path(”)

I’d like to put together only unique values in the concatenated string. My code is currently:

select rc.Routage
    , COUNT(distinct rc.Event)
    , STUFF((select ', ' + cast(rcA.Event as varchar)
            from Receiving rcA
            where rcA.supplier = 'user'
            for xml path(''))
        , 1, 1, '')
from Receiving rc
where rc.supplier = 'user'
group by rc.Routage
order by COUNT(distinct rc.Event)desc

This gives me the output I’d expect, but I would like to eliminate the duplicate values in the stuff/for xml path field.

I’ve tried various combinations of distinct and group by in the stuff/xml section, but can’t piece it together properly.

To clarify, for COUNT(distinct rc.Event) = 2, I would like to see 2 distinct events from the stuff clause. How can I do this?

Advertisement

Answer

Use select distinct in the subquery:

select rc.Routage,
       count(distinct rc.Event),
       stuff((select distinct ', ' + cast(rcA.Event as varchar(max))
              from Receiving rcA
              where rcA.supplier = 'user' and
                    rcA.DATETIME > '20170322' and
                    rc.Routage = rcA.Routage
              for xml path('')
             ), 1, 2, '')
from Receiving rc
where rc.supplier = 'user' and rc.DATETIME > '20170322'
group by rc.Routage;

Notes:

  • In SQL Server, never use varchar() (or related types) without a length. The default varies by context and you are (potentially) introducing a bug that is really hard to find.
  • You want the stuff() to remove two characters, not 1, because you have a comma followed by a space.
  • This formulation assumes that Event does not have XML special characters. It is easy to tweak if that is an issue.

Also, this type of query is usually faster if you eliminate the duplicates in a subquery:

select rc.Routage, rc.numEvents,
       stuff((select distinct ', ' + cast(rcA.Event as varchar(max))
              from Receiving rcA
              where rcA.supplier = 'user' and
                    rcA.DATETIME > '20170322' and
                    rc.Routage = rcA.Routage
              for xml path(''), type
             ).value('.', 'varchar(max)'
                    ), 1, 2, ''
            )
from (select rc.Routage, count(distinct rc.Event) as numEvents
      from Receiving rc
      where rc.supplier = 'user' and rc.DATETIME > '20170322'
      group by rc.Routage
     ) rc;
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement