i have the following query in hive to get the counts per each of those columns (cluster, country and airline) as a percentage. But my percentage column contains only 0’s.. why/what am i doing wrong below?
select count(*)/ t.cnt * 100 AS percentage, cluster, country, airline from table1 CROSS JOIN (SELECT COUNT(*) AS cnt FROM table1 ) t GROUP BY cluster, country, airline
Advertisement
Answer
First, you should use window functions.
Second, beware of integer division.
I would phrase this as:
select count(*) * 100.0 / sum(count(*)) over () AS percentage, cluster, country, airline from table1 group by cluster, country, airline;