First, sorry if the title is misleading, so i make this example.
Table : data_list
Column : Serial, A1, A2, A3, A4, A5
Datas :
- 381, 0, 0, 0, 125, 99
- 537, 10002, 1234, 0, 0, 0
- 931, 0, 0, 0, 0, 0
- 1213, 123, 100, 0, 0, 10002
- 1437, 7361, 918, 17823, 0, 0
- 7777, 0, 0, 100, 0, 1234
- 7951, 125, 0, 0, 918, 0
From above example, i want the result to be :
- 381
because there is duplicate value 125 from serial 7951
- 537
because there is duplicate value 10002 from serial 1213
- 1213
because there is duplicate value 10002 from serial 537
- 7777
because there is duplicate value 100 from serial 1213
- 7951
because there is duplicate value 125 from serial 381, and value 918 from serial 1437
value 0 not counted as duplicate
I have tried several logic, also from others question from stackoverflow. Still i cant figure it out.
Anyone can help? thanks
Advertisement
Answer
In SQL Server, I would unpivot and use window functions:
select dl.serial, v.a from (select dl.serial, v.a, count(*) over (partition by v.a) as cnt from data_list dl cross apply (values (dl.a1), (dl.a2), (dl.a3), (dl.a4), (dl.a5) ) v(a) where v.a <> 0 ) dl where cnt > 1;
The above version runs the risk of returning a duplicate serial
even if the duplicate is within a single row. If that is a possibility, it can be easily modified. Although SQL Server does not support count(distinct)
as a window function, it is easy to emulate with dense_rank()
:
select dl.serial, v.a from (select dl.serial, v.a, (dense_rank() over (partition by v.a order by dl.seral asc) + dense_rank() over (partition by v.a order by dl.seral desc) ) as cnt from data_list dl cross apply (values (dl.a1), (dl.a2), (dl.a3), (dl.a4), (dl.a5) ) v(a) where v.a <> 0 ) dl where cnt > 1;