Skip to content
Advertisement

How to optimize my query speed (avoid using subselect for every row)?

I have a table called CisLinkLoadedData. Is has Distributor, Network, Product, DocumentDate, Weight, AmountCP and Quantity columns. It used to store some product daily sales. AmountCP / Quantity is the price for the product at certain date. There are promo and regular sales, but no flag for it. We can tell if certain record is regular or promo by comparing it’s price with the maximum recorded price within month. I did explained it on this picture. enter image description here

I need to make a query to display summarized regular and promo sales of certain product per month. Well, I made it, but it very slow (6 minutes to execute at 1.6 millions records). I suspect this is because I use subquery to determine max price for every record, but I don’t know how to make it another way.

This is what I made:

SELECT
    Distributor,
    Network,
    Product,
    cast(month(DocumentDate) as VARCHAR) + '.' + cast(year(DocumentDate) as VARCHAR) AS MonthYear,
    SUM(Weight) AS MonthlyWeight,
    IsPromo
FROM (SELECT
        main_clld.Distributor,
        main_clld.Network,
        main_clld.Product,
        main_clld.DocumentDate,
        main_clld.Weight,
        main_clld.Quantity,
        main_clld.AmountCP,
        CASE WHEN (main_clld.AmountCP / main_clld.Quantity) < (SELECT MAX(sub_clld.AmountCP / NULLIF(sub_clld.Quantity, 0)) FROM CisLinkLoadedData AS sub_clld WHERE sub_clld.Distributor = main_clld.Distributor AND sub_clld.Network = main_clld.Network AND sub_clld.Product = main_clld.Product AND cast(month(sub_clld.DocumentDate) as VARCHAR) + '.' + cast(year(sub_clld.DocumentDate) as VARCHAR) = cast(month(main_clld.DocumentDate) as VARCHAR) + '.' + cast(year(main_clld.DocumentDate) as VARCHAR) AND sub_clld.Quantity > 0 AND sub_clld.GCRecord IS NULL) THEN 1 ELSE 0 END AS IsPromo
    FROM CisLinkLoadedData AS main_clld
    WHERE main_clld.Quantity > 0 AND main_clld.GCRecord IS NULL) AS bad_query
GROUP BY
    Distributor,
    Network,
    Product,
    cast(month(DocumentDate) as VARCHAR) + '.' + cast(year(DocumentDate) as VARCHAR),
    IsPromo;

What is possible to do in such case? By the way, if you can do result table with another structure like that (Distributor, Network, Product, MonthYear, RegularWeight, PromoWeight) – it’s even better. This is what I tried initially, but failed. I use Microsoft SQL Server.

Advertisement

Answer

Rather than a correlated subquery, you could use a windowed function to retrieve the maximum price per group (each group is defined by the partition by clause):

MAX(main_clld.AmountCP / NULLIF(main_clld.Quantity, 0)) 
 OVER(PARTITION BY main_clld.Distributor, main_clld.Network, 
                   main_clld.Product, EOMONTH(main_clld.DocumentDate))

I think your full query would end up something like:

SELECT
    Distributor,
    Network,
    Product,
    MonthYear,
    SUM(Weight) AS MonthlyWeight,
    IsPromo
FROM (SELECT
        main_clld.Distributor,
        main_clld.Network,
        main_clld.Product,
        main_clld.DocumentDate,
        main_clld.Weight,
        main_clld.Quantity,
        main_clld.AmountCP,
        CAST(MONTH(DocumentDate) AS VARCHAR(2)) + '.' + cast(year(DocumentDate) as VARCHAR(2)) AS MonthYear,
        CASE WHEN (main_clld.AmountCP / main_clld.Quantity) < MAX(main_clld.AmountCP / NULLIF(main_clld.Quantity, 0)) 
                                                                OVER(PARTITION BY main_clld.Distributor, main_clld.Network,
                                                                                    main_clld.Product, EOMONTH(main_clld.DocumentDate)) 
                THEN 1 ELSE 0 END AS IsPromo
    FROM CisLinkLoadedData AS main_clld
    WHERE main_clld.Quantity > 0 
    AND main_clld.GCRecord IS NULL
    ) AS bad_query
GROUP BY
    Distributor,
    Network,
    Product,
    MonthYear,
    IsPromo;
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement