Is there a way to easily retrieve last years data during volume aggregation, grouped by year, month.
Sample of code below (from BQ). It shows an error in the subquery
WHERE clause expression references t1.date
which is neither grouped nor aggregated
x
SELECT
EXTRACT(YEAR FROM t1.date) AS year,
EXTRACT(MONTH FROM t1.date) AS month,
t1.ProductId AS product,
SUM(t1.Quantity) AS UnitsSold_TY,
(SELECT
SUM(Quantity)
FROM `new-project-jun21.sales.sales_info`
WHERE
EXTRACT(YEAR FROM date) = EXTRACT(YEAR FROM t1.date) - 1 AND
EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM t1.date) AND
ProductId = t1.ProductId
GROUP BY
EXTRACT(YEAR FROM date),
EXTRACT(MONTH FROM date),
EXTRACT(MONTH FROM t1.date),
ProductId) AS UnitsSold_LY
FROM `new-project-jun21.sales.sales_info` AS t1
GROUP BY
year,
month,
product
ORDER BY product, year, month
Advertisement
Answer
If you have data every month, then you can use lag()
. I would recommend using date_trunc()
instead of separating the year and date components. So:
SELECT productId,
DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm
SUM(Quantity),
LAG(SUM(Quantity), 12) OVER (PARTITION BY ProductId ORDER BY MIN(date)) as yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH);
ORDER BY product, yyyymm;
If you have missing months for some products, then you can still use window functions, but the logic is a bit more complicated.
EDIT:
If you don’t have data every month, then you can use a RANGE
specification, but you need a month counter:
SELECT productId,
DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm
SUM(Quantity),
MAX(SUM(Quantity)) OVER (PARTITION BY ProductId
ORDER BY DATE_DIFF(MIN(DATE), DATE '2000-01-01', month)
RANGE BETWEEN 12 PRECEDING AND 12 PRECEDING
) as yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH);
ORDER BY product, yyyymm;