Hi I need 1 query to get top 10 country which has largest [total(import) – total(export)] for goods_type medicines between 2019 – 2020.
The data sample is as below:
year | trading_type | country | goods_type | amount 2020 import ABC medicines 12345.67 2017 import ABC medicines null 2019 export DEF foods 987.65 2018 export ABC foods 2345.6 2016 export DEF medicines 120.3 2019 export ABC medicines 345.67 2020 import DEF foods 321.04 ... ... ... ... ...
The returned data should include country, goods_type, and the value of [total(imports) – total(export)].
I have come up with the query below, but I don’t know if it’s right or wrong, and I struggle to extend it to get other columns. I got an error telling select expression column ... not grouped or aggregated... in Google BigQuery console.
SELECT country, year FROM `trading_records` T    <--- error here for the year
WHERE
    T.product_type = 'medicines' AND
    (T.year = 2019 OR T.year = 2020)
GROUP BY T.country
ORDER BY (
    (SELECT SUM(amount) FROM `trading_records`
    WHERE trading_type = 'import' AND country = T.country)
    -
    (SELECT SUM(amount) FROM `trading_records`
    WHERE trading_type = 'export' AND country = T.country)
) DESC
LIMIT 10;
I appreciate any help! Thanks.
Advertisement
Answer
You can express this as a single query with GROUP BY, filtering, and conditional aggregation:
SELECT country, 
       SUM(CASE WHEN trading_type = 'import' THEN amount ELSE - amount END) as total 
FROM data 
WHERE trading_type in ('import', 'export') AND
      goods_type = 'medicines' AND
      year >= 2019 AND
      year <= 2020
GROUP BY country
ORDER BY total DESC
LIMIT 10;
Note that this does not include YEAR in the SELECT because it is “aggregated away”.