I have created following HIVE
table which helps to determine the count and type of individuals from the source hive table. Now I need to create further percentage breakout from this:
CREATE TABLE hh_member_type AS SELECT count(*) AS count , CASE WHEN (LEAD_HOUSEHLD_STATUS = 'D') THEN 'Deceased' WHEN (LEAD_HOUSEHLD_STATUS = 'H') THEN 'Head' WHEN (LEAD_HOUSEHLD_STATUS = 'P') THEN 'Aged Parent At Home' WHEN (LEAD_HOUSEHLD_STATUS = 'U') THEN 'Unknown' WHEN (LEAD_HOUSEHLD_STATUS = 'W') THEN 'Spouse' WHEN (LEAD_HOUSEHLD_STATUS = 'Y') THEN 'Young Adult' ELSE 'NULL' END AS HH_STATUS FROM ( SELECT LEAD_HOUSEHLD_STATUS FROM customer UNION ALL SELECT MEM1_HH_STATUS FROM customer UNION ALL SELECT MEM2_HH_STATUS FROM customer UNION ALL SELECT MEM3_HH_STATUS FROM customer UNION ALL SELECT MEM4_HH_STATUS FROM customer UNION ALL SELECT MEM5_HH_STATUS FROM customer UNION ALL SELECT MEM6_HH_STATUS FROM customer UNION ALL SELECT MEM7_HH_STATUS FROM customer UNION ALL SELECT MEM8_HH_STATUS FROM customer)A WHERE LEAD_HOUSEHLD_STATUS IN ('H','D','P','U','W','Y') GROUP BY LEAD_HOUSEHLD_STATUS ;
The output for above code is this:
hive> select * from hh_member_type ; OK 277 Deceased 20636 Head 3547 Aged Parent At Home 18926 Unknown 16184 Spouse 7082 Young Adult Time taken: 0.195 seconds, Fetched: 6 row(s)
I tried to create percentage break out using following code, but not getting desired output. Any suggestions would be great.
CREATE TABLE hh_member_percent AS SELECT a.hh_status,round(a.count/sum(a.count) * 100 ,2) from hh_member_type a group by a.count,a.hh_status ;
Advertisement
Answer
Use analytics sum():
SELECT a.hh_status,round(a.count/sum(a.count) over() * 100 ,2) from hh_member_type a ;