I trying to make
x
ID CATEGORY VALUE
'AAA' 'X' 123
'AAA' 'Y' 456
'BBB' 'X' 321
'BBB' 'Y' 654
into
ID VALUE_X VALUE_Y
'AAA' 123 456
'BBB' 321 654
using only SQL queries. It is kind of similar to using groupByKey() in pyspark.
Is there a way to do this?
Advertisement
Answer
Just use conditional aggregation. One method is:
select id,
max(case when category = 'X' then value end) as x_value,
max(case when category = 'Y' then value end) as y_value
from t
group by id;
In Postgres, this would be phrased using the standard filter
clause:
select id,
max(value) filter (where category = 'X'),
max(value) filter (where category = 'Y')
from t
group by id;