I have a requirement that needs columns with values to be transposed into rows. For instance refer to the table below:
cust:
cust_id | cover1 | cover2 | cover3 1234 | 'PAG' | Null | 'TDE' 5678 | Null | 'GAP' | Null
Given the above table, we have to find out which columns have a value and if there is a value in that column then there should be a row created. For e.g.
cust_id | cover 1234 | 'PAG' 1234 | 'TDE' 5678 | 'GAP'
For the customer 1234
only cover1 and cover 3
are populated hence there will be 2
records created. For 5678
cover1 & cover2
are Nulls hence, only 1
record for cover_3
needs to be created.
I could apply a simple approach like below. But I was wondering if there is an elegant approach and a smarter solution to this.
select cust_id, cover1 AS cover where cover1 IS Not Null UNION ALL select cust_id, cover2 AS cover where cover2 IS Not Null UNION ALL select cust_id, cover3 AS cover where cover3 IS Not Null
Please share your thoughts. We use Spark-SQL 2.4
Thanks
Advertisement
Answer
The requirement could be met by using Spark’s stack()
with the %sql mode:
create temporary view cust as select * from values (12345, 'PAG', Null, 'TDE'), (5678, Null, 'GAP', Null) as (cust_id, cover1, cover2, cover3); select * from (select cust_id, stack(3, 'cover1', cover1, 'cover2', cover2, 'cover3', cover3) as (cover_nm, cover) from cust ) where cover is not null;