I want to get the cases per million where a table contains “Province Population” and another table “Interim” contains a column that has unique Epid_ID values. I want to divide the count of these “Epid_ID” values with the population where the data type of population is big numeric and I am using Big Query. Following is the sql query I have written.
SELECT Admin_Level_1_province, count(Epid_ID) as cumulative_numbers, (count(Epid_ID)/ppopn.Population) as case_per_million FROM `interim-data.casedata.Interim` as Inte join `interim-data.casedata.Province Population` as ppopn ON Inte.Admin_Level_1_province = ppopn.Province GROUP BY Admin_Level_1_province , case_per_million
The error I am getting is “Column case_per_million contains an aggregation function, which is not allowed in GROUP BY at [7:35]”
Following is my sample data for Interim:
Admin_Level_1_province Week_number YEAR Epid_ID Province 1 21 2020 COV-NEP-PR5-RUP-20-00022 Province 2 21 2021 COV-NEP-PR5-RUP-20-00023
Following is my sample data for Province Population Table
Province Population Province 1 23456778 Province 2 12345566
Advertisement
Answer
Try below instead
SELECT Admin_Level_1_province, count(Epid_ID) as cumulative_numbers, 1000000 * (count(Epid_ID)/any_value(ppopn.Population)) as case_per_million FROM `interim-data.casedata.Interim` as Inte join `interim-data.casedata.Province_Population` as ppopn ON Inte.Admin_Level_1_province = ppopn.Province GROUP BY Admin_Level_1_province