I have a requirement that needs columns with values to be transposed into rows. For instance refer to the table below:
cust:
x
cust_id | cover1 | cover2 | cover3
1234 | 'PAG' | Null | 'TDE'
5678 | Null | 'GAP' | Null
Given the above table, we have to find out which columns have a value and if there is a value in that column then there should be a row created. For e.g.
cust_id | cover
1234 | 'PAG'
1234 | 'TDE'
5678 | 'GAP'
For the customer 1234
only cover1 and cover 3
are populated hence there will be 2
records created. For 5678
cover1 & cover2
are Nulls hence, only 1
record for cover_3
needs to be created.
I could apply a simple approach like below. But I was wondering if there is an elegant approach and a smarter solution to this.
select cust_id, cover1 AS cover where cover1 IS Not Null
UNION ALL
select cust_id, cover2 AS cover where cover2 IS Not Null
UNION ALL
select cust_id, cover3 AS cover where cover3 IS Not Null
Please share your thoughts. We use Spark-SQL 2.4
Thanks
Advertisement
Answer
The requirement could be met by using Spark’s stack()
with the %sql mode:
create temporary view cust
as
select * from values (12345, 'PAG', Null, 'TDE'),
(5678, Null, 'GAP', Null) as (cust_id, cover1, cover2, cover3);
select * from
(select cust_id,
stack(3, 'cover1', cover1,
'cover2', cover2,
'cover3', cover3) as (cover_nm, cover)
from cust
)
where cover is not null;