Skip to content
Advertisement

Need to generate percentage column in Hive after generating Count from the table

I have created following HIVE table which helps to determine the count and type of individuals from the source hive table. Now I need to create further percentage breakout from this:

CREATE TABLE hh_member_type AS 
SELECT count(*) AS count ,
CASE WHEN (LEAD_HOUSEHLD_STATUS = 'D') THEN 'Deceased' 
     WHEN (LEAD_HOUSEHLD_STATUS = 'H') THEN 'Head' 
     WHEN (LEAD_HOUSEHLD_STATUS = 'P') THEN 'Aged Parent At Home' 
     WHEN (LEAD_HOUSEHLD_STATUS = 'U') THEN 'Unknown' 
     WHEN (LEAD_HOUSEHLD_STATUS = 'W') THEN 'Spouse' 
     WHEN (LEAD_HOUSEHLD_STATUS = 'Y') THEN 'Young Adult' ELSE 'NULL' 
 END AS HH_STATUS 
FROM ( 
SELECT  LEAD_HOUSEHLD_STATUS FROM customer 
UNION ALL SELECT MEM1_HH_STATUS FROM customer 
UNION ALL SELECT MEM2_HH_STATUS FROM customer 
UNION ALL SELECT MEM3_HH_STATUS FROM customer 
UNION ALL SELECT MEM4_HH_STATUS FROM customer 
UNION ALL SELECT MEM5_HH_STATUS FROM customer 
UNION ALL SELECT MEM6_HH_STATUS FROM customer 
UNION ALL SELECT MEM7_HH_STATUS FROM customer 
UNION ALL SELECT MEM8_HH_STATUS FROM customer)A 
WHERE LEAD_HOUSEHLD_STATUS IN ('H','D','P','U','W','Y') 
GROUP BY LEAD_HOUSEHLD_STATUS ;

The output for above code is this:

hive> select * from hh_member_type ;
OK
277 Deceased
20636   Head
3547    Aged Parent At Home
18926   Unknown
16184   Spouse
7082    Young Adult
Time taken: 0.195 seconds, Fetched: 6 row(s)

I tried to create percentage break out using following code, but not getting desired output. Any suggestions would be great.

CREATE TABLE hh_member_percent AS 
SELECT a.hh_status,round(a.count/sum(a.count) * 100 ,2) 
  from hh_member_type a 
 group by a.count,a.hh_status ;

Advertisement

Answer

Use analytics sum():

SELECT a.hh_status,round(a.count/sum(a.count) over() * 100 ,2) 
  from hh_member_type a 
;
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement