Have a table in Hive with a following structure:
col1 col2 col3 col4 col5 col6 ----------------------------- AA NM ER NER NER NER AA NM NER ERR NER NER AA NM NER NER TER NER AA NM NER NER NER ERY
Wrote a query to fetch the record from the table:
Select distinct(col1),col2, array(concat( CASE WHEN col3=='ER' THEN 'ER' WHEN col4=='ERR' THEN 'ERR' WHEN col5=='TER' THEN 'TER' WHEN col6=='ERY' THEN 'ERY' ELSE 'NER' END
but its not working. Not getting how to go about it.
Expected O/P:
col1 col2 col3 -------------- AA NM ['ER','ERR','TER','ERY']
Any suggestion/hint will be really helpful.
Advertisement
Answer
Please try below –
select col1, col2, array( max(CASE WHEN col3=='ER' THEN 'ER' else '' end), max(CASE WHEN col4=='ERR' THEN 'ERR' else '' end), max(CASE WHEN col5=='TER' THEN 'TER' else '' end), max(CASE WHEN col6=='ERY' THEN 'ERY' else '' end)) from table group by col1, col2