I wish to output a sum of revenue per year per product, but also include a column with the year of introduction of each product (based on date first ordered).
I cannot figure out what I am doing wrong here: I can do each calculation separately but not, it seems, within the same query. I am sure this is a basic misunderstanding of grouping or aggregation.
WITH input_data AS (
  SELECT '#1238' AS order_id, DATE('2021-12-15') AS order_date, 'c12345' AS sku, 18 AS order_value UNION ALL
  SELECT '#1201', DATE('2021-10-10'), 'c12345', 18 UNION ALL
  SELECT '#1198', DATE('2021-07-05'), 'c12345', 20 UNION ALL
  SELECT '#1134', DATE('2020-10-15'), 'c12345', 10 UNION ALL
  SELECT '#1112', DATE('2019-08-10'), 'c12345', 5 UNION ALL
  SELECT '#1234', DATE('2021-07-05'), 'c11111', 118 UNION ALL
  SELECT '#1294', DATE('2021-01-05'), 'c11111', 68 UNION ALL
  SELECT '#1290', DATE('2021-01-01'), 'c11111', 82 UNION ALL
  SELECT '#1284', DATE('2020-01-15'), 'c22222', 98
)
SELECT
  sku,
  --format_date('%Y', min(order_date)) as year_intro1,
  format_date('%Y', min(order_date) over (partition BY sku)) as year_intro2,
  format_date('%Y', order_date) as year_order,
  sum(order_value) as year_rev
FROM input_data
GROUP BY 1,3
year_intro will always have a single value per sku.
I think I want year_intro2, but it gives me an error.
year_intro1 works but just picks up the min value within each revenue year (as below).
The desired output for c12345 year_intro has a value of 2019 for all year_order values.
Answer
Consider the approach below:
select
  sku,
  any_value(year_intro) year_intro,
  year_order,
  sum(order_value) year_rev
from (
  select
    sku,
    order_value,
    extract(year from order_date) as year_order,
    min(extract(year from order_date)) over(partition by sku) year_intro
  from input_data
)
group by sku, year_order
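Applied to the sample data in your question, the output is (row order may vary, as there is no ORDER BY):

sku     year_intro  year_order  year_rev
c11111  2021        2021        268
c12345  2019        2019        5
c12345  2019        2020        10
c12345  2019        2021        56
c22222  2020        2020        98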
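As for why year_intro2 fails in the original query: analytic (window) functions are evaluated after GROUP BY, so min(order_date) over (...) refers to order_date at a point where it is neither grouped nor aggregated. One possible alternative to the subquery, sketched below and not run here, is to aggregate first and apply the window function to that aggregate. The alias year_intro is my own choice, and input_data simply repeats the sample rows from the question.

```sql
-- Sketch in BigQuery Standard SQL; input_data repeats the question's sample rows.
WITH input_data AS (
  SELECT '#1238' AS order_id, DATE('2021-12-15') AS order_date, 'c12345' AS sku, 18 AS order_value UNION ALL
  SELECT '#1201', DATE('2021-10-10'), 'c12345', 18 UNION ALL
  SELECT '#1198', DATE('2021-07-05'), 'c12345', 20 UNION ALL
  SELECT '#1134', DATE('2020-10-15'), 'c12345', 10 UNION ALL
  SELECT '#1112', DATE('2019-08-10'), 'c12345', 5 UNION ALL
  SELECT '#1234', DATE('2021-07-05'), 'c11111', 118 UNION ALL
  SELECT '#1294', DATE('2021-01-05'), 'c11111', 68 UNION ALL
  SELECT '#1290', DATE('2021-01-01'), 'c11111', 82 UNION ALL
  SELECT '#1284', DATE('2020-01-15'), 'c22222', 98
)
SELECT
  sku,
  -- Inner MIN: earliest order date within each (sku, year_order) group.
  -- Outer MIN ... OVER: earliest of those per-group minimums across the whole sku.
  FORMAT_DATE('%Y', MIN(MIN(order_date)) OVER (PARTITION BY sku)) AS year_intro,
  FORMAT_DATE('%Y', order_date) AS year_order,
  SUM(order_value) AS year_rev
FROM input_data
GROUP BY sku, year_order
```

Because the window function now sees only an aggregated value, it no longer conflicts with the GROUP BY. Note that FORMAT_DATE returns the year as a string, whereas the EXTRACT(YEAR ...) used in the answer returns an integer.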