Skip to content
Advertisement

Duplicate entries based on values in key/value pair table in SQL Server

I have a key/value pair table

Create Table mat_characteristics
(
 material varchar(100),
 characteristic varchar(100),
 characteristic_value varchar(100)
)
GO

Insert into mat_characterstics values
('113567','height','20.3'),
('113567','weight','11.3'),
('113567','diameter','6.3'),
('113567','length','6.3'),
('113564','height','20.3'),
('113564','length','2.3'),
('113564','length','6.3'),
('113565','height','20.3'),
('113565','weight','11.3'),
('113565','weight','2.3'),
('113565','weight','7.3'),
('113565','diameter','6.3'),
('113565','length','6.3'),
('113545','height','20.3'),
('113545','weight','11.3'),
('113545','weight','2.3'),
('113545','weight','7.3'),
('113545','diameter','6.3'),
('113545','length','6.3');

If you see the 113565 and 113545 materials are having the same characteristics and characteristic_value. According to our business rules, no two materials should be having the same characteristics and characteristic_value. And in my real case scenario, materials sometimes will be having nearly 1000 characteristics.

please help me to get the bad records identified.


I tried making the rows into columns using this query but this is resulting in more than 1300 columns and I am not sure how to proceed after this

DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
--@ColumnName is used to store the maximum number of column names
DECLARE @ColumnName AS NVARCHAR(MAX)

-- Creating temporary table
select * into #tmpmat_characterstics
from 
(select material, characterstic,characterstic_value,characterstic+cast(row_number() over (partition by material,characterstic order by characterstic) as varchar(45)) as ColumnSequence from mat_characterstics) mat_characterstics

select @ColumnName= COALESCE(@ColumnName+ ', ','') + QUOTENAME(columnsequence)
from 
(
select distinct columnsequence from #tmpmat_characterstics
) test


set @DynamicPivotQuery= 

'select material, ' +@ColumnName+ 'from #tmpmat_characterstics
PIVOT
(
    max(characterstic_value)
    for ColumnSequence in (' +@ColumnName+ ')
) PIV'


EXEC(@DynamicPivotQuery);

Advertisement

Answer

The simplest method is string_agg():

select characteristics, count(*),
       string_agg(material, ',') as materials
from (select material,
             string_agg(concat(characteristic, ':', characteristic_value), ',') as characteristcs
      from mat_characteristics
      group by material
     ) m
group by characteristics
having count(*) > 1;
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement