I have the following table –
A B C D 1720116452 1053384214 1205809217 1053384214 1053695668 1083893937 1518906734 1740293257 1053695668 1083893937 1518906734 1740293257 1053695668 1124136593 1518906734 1740293257 1053695668 1124136593 1518906734 1740293257 1053695668 1124136593 1518906734 1740293257 1053695668 1124136593 1518906734 1740293257 1124136593 1518906734 1740293253 1053695668
How do I write a query to produce this output –
A B 1720116452 1053384214 1720116452 1205809217 1720116452 1053384214 1053695668 1083893937 1053695668 1518906734 1053695668 1740293257 1053695668 1124136593 1124136593 1740293253 1124136593 1518906734
The output table is a connection table between all the columns in the input table. For example, if the value in column A <> column B
in the input table then insert a record in the output table. If a value in column A <> column C
, then insert another record. The reason rows 5, 6, 7 and 8 are omitted from the output is because those connections already exist in the output.
The reason 1124136593 is only connected to 2 values and not 3 is because 1053695668 is already connected to it in row 7 of the output.
Advertisement
Answer
I think you are describing an unpivot operation with the removal of duplicates in either direction:
select distinct on (least(v.a, v.b), greatest(v.a, v.b)) v.a, v.b from t cross join lateral (values (a, b), (a, c), (a, d)) v(a, b) order by least(v.a, v.b), greatest(v.a, v.b);
The use of distinct on
here prevents “reverse” duplicates. It also ensures that the resulting values have a
in the first column.