Skip to content
Advertisement

How to retrieve last year data on line-by-line basis (main set grouped by year, month & aggregated on volume)?

Is there a way to easily retrieve last years data during volume aggregation, grouped by year, month.

Sample of code below (from BQ). It shows an error in the subquery WHERE clause expression references t1.date which is neither grouped nor aggregated

SELECT
EXTRACT(YEAR FROM t1.date) AS year,
EXTRACT(MONTH FROM t1.date) AS month,
t1.ProductId AS product,
SUM(t1.Quantity) AS UnitsSold_TY,
(SELECT
    SUM(Quantity)
FROM `new-project-jun21.sales.sales_info`
WHERE
    EXTRACT(YEAR FROM date) =  EXTRACT(YEAR FROM t1.date) - 1 AND
    EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM t1.date) AND
    ProductId = t1.ProductId
GROUP BY
    EXTRACT(YEAR FROM date),
    EXTRACT(MONTH FROM date),
    EXTRACT(MONTH FROM t1.date),
    ProductId) AS UnitsSold_LY
FROM `new-project-jun21.sales.sales_info` AS t1 
GROUP BY
year,
month,
product
ORDER BY product, year, month

Advertisement

Answer

If you have data every month, then you can use lag(). I would recommend using date_trunc() instead of separating the year and date components. So:

SELECT productId,
       DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm
       SUM(Quantity),
       LAG(SUM(Quantity), 12) OVER (PARTITION BY ProductId ORDER BY MIN(date)) as yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH);
ORDER BY product, yyyymm;

If you have missing months for some products, then you can still use window functions, but the logic is a bit more complicated.

EDIT:

If you don’t have data every month, then you can use a RANGE specification, but you need a month counter:

SELECT productId,
       DATE_TRUNC(date, INTERVAL 1 MONTH) as yyyymm
       SUM(Quantity),
       MAX(SUM(Quantity)) OVER (PARTITION BY ProductId
                                ORDER BY DATE_DIFF(MIN(DATE), DATE '2000-01-01', month)
                                RANGE BETWEEN 12 PRECEDING AND 12 PRECEDING
                               ) as yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY product_id, DATE_TRUNC(date, INTERVAL 1 MONTH);
ORDER BY product, yyyymm;
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement