Pivot with changed structure of the final table

I have the following table –

A           B           C           D
1720116452  1053384214  1205809217  1053384214
1053695668  1083893937  1518906734  1740293257
1053695668  1083893937  1518906734  1740293257
1053695668  1124136593  1518906734  1740293257
1053695668  1124136593  1518906734  1740293257
1053695668  1124136593  1518906734  1740293257
1053695668  1124136593  1518906734  1740293257
1124136593  1518906734  1740293253 1053695668

​x
 
A           B           C           D1720116452  1053384214  1205809217  10533842141053695668  1083893937  1518906734  17402932571053695668  1083893937  1518906734  17402932571053695668  1124136593  1518906734  17402932571053695668  1124136593  1518906734  17402932571053695668  1124136593  1518906734  17402932571053695668  1124136593  1518906734  17402932571124136593  1518906734  1740293253 1053695668​

How do I write a query to produce this output –

A           B           
1720116452  1053384214  
1720116452  1205809217
1720116452  1053384214
1053695668  1083893937  
1053695668  1518906734  
1053695668  1740293257
1053695668  1124136593  
1124136593  1740293253
1124136593  1518906734

 
A           B           1720116452  1053384214  1720116452  12058092171720116452  10533842141053695668  1083893937  1053695668  1518906734  1053695668  17402932571053695668  1124136593  1124136593  17402932531124136593  1518906734​

The output table is a connection table between all the columns in the input table. For example, if the value in column A <> column B in the input table then insert a record in the output table. If a value in column A <> column C, then insert another record. The reason rows 5, 6, 7 and 8 are omitted from the output is because those connections already exist in the output.

The reason 1124136593 is only connected to 2 values and not 3 is because 1053695668 is already connected to it in row 7 of the output.

Answer

I think you are describing an unpivot operation with the removal of duplicates in either direction:

select distinct on (least(v.a, v.b), greatest(v.a, v.b)) v.a, v.b
from t cross join lateral
     (values (a, b), (a, c), (a, d)) v(a, b)
order by least(v.a, v.b), greatest(v.a, v.b);

 
select distinct on (least(v.a, v.b), greatest(v.a, v.b)) v.a, v.bfrom t cross join lateral     (values (a, b), (a, c), (a, d)) v(a, b)order by least(v.a, v.b), greatest(v.a, v.b);​

The use of distinct on here prevents “reverse” duplicates. It also ensures that the resulting values have a in the first column.

Advertisement

Answer