Is there a way to easily retrieve last years data during volume aggregation, grouped by year, month.
Sample of code below (from BQ). It shows an error in the subquery
WHERE clause expression references t1.date
which is neither grouped nor aggregated
SELECT EXTRACT(YEAR FROM t1.date) AS year, EXTRACT(MONTH FROM t1.date) AS month, t1.ProductId AS product, SUM(t1.Quantity) AS UnitsSold_TY, (SELECT SUM(Quantity) FROM `new-project-jun21.sales.sales_info` WHERE EXTRACT(YEAR FROM date) = EXTRACT(YEAR FROM t1.date) - 1 AND EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM t1.date) AND ProductId = t1.ProductId GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date), EXTRACT(MONTH FROM t1.date), ProductId) AS UnitsSold_LY FROM `new-project-jun21.sales.sales_info` AS t1 GROUP BY year, month, product ORDER BY product, year, month
Advertisement
Answer
If you have data every month, then you can use lag()
. I would recommend using date_trunc()
instead of separating the year and date components. So:
SELECT productId, DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm SUM(Quantity), LAG(SUM(Quantity), 12) OVER (PARTITION BY ProductId ORDER BY MIN(date)) as yoy FROM `new-project-jun21.sales.sales_info` GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH); ORDER BY product, yyyymm;
If you have missing months for some products, then you can still use window functions, but the logic is a bit more complicated.
EDIT:
If you don’t have data every month, then you can use a RANGE
specification, but you need a month counter:
SELECT productId, DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm SUM(Quantity), MAX(SUM(Quantity)) OVER (PARTITION BY ProductId ORDER BY DATE_DIFF(MIN(DATE), DATE '2000-01-01', month) RANGE BETWEEN 12 PRECEDING AND 12 PRECEDING ) as yoy FROM `new-project-jun21.sales.sales_info` GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH); ORDER BY product, yyyymm;