Gaps And Islands: Splitting Islands Based On External Table

Question

My scenario started off similar to a gaps-and-islands problem, where I needed to find consecutive days of work. My current SQL query answers “ProductA was produced at LocationA from DateA through DateB,…

Accepted Answer

The straightforward method is to fetch the effective price for each row of History and then generate gaps and islands taking price into account.

It is not clear from the question what role DestinationID plays, and the sample data is of no help here. I'll assume that we need to join and partition on both ProductID and DestinationID.

The following query returns the effective Price for each row from History. For this query to work efficiently, you need to add an index to the PriceChange table:

    CREATE NONCLUSTERED INDEX [IX] ON [dbo].[PriceChange]
    (
        [ProductId] ASC,
        [DestinationId] ASC,
        [EffectiveDate] DESC
    )
    INCLUDE ([Price])

Query for Prices

    SELECT
        History.ProductId
        ,History.DestinationId
        ,History.ScheduledDate
        ,History.Quantity
        ,A.Price
    FROM
        History
        OUTER APPLY
        (
            -- latest price change on or before the scheduled date
            SELECT TOP(1)
                PriceChange.Price
            FROM
                PriceChange
            WHERE
                PriceChange.ProductID = History.ProductID
                AND PriceChange.DestinationId = History.DestinationId
                AND PriceChange.EffectiveDate <= History.ScheduledDate
            ORDER BY
                PriceChange.EffectiveDate DESC
        ) AS A
    ORDER BY ProductID, ScheduledDate;

For each row from History there will be one seek in this index to pick the correct price.

This query returns:

Prices

    +-----------+---------------+---------------+----------+-------+
    | ProductId | DestinationId | ScheduledDate | Quantity | Price |
    +-----------+---------------+---------------+----------+-------+
    |         0 |          1000 | 2018-04-01    |        5 |     1 |
    |         0 |          1000 | 2018-04-02    |       10 |     2 |
    |         0 |          1000 | 2018-04-03    |        7 |     2 |
    |         3 |          5000 | 2018-05-07    |       15 |     5 |
    |         3 |          5000 | 2018-05-08    |       23 |     5 |
    |         3 |          5000 | 2018-05-09    |       52 |     5 |
    |         3 |          5000 | 2018-05-10    |       12 |    20 |
    |         3 |          5000 | 2018-05-11    |       14 |    20 |
    +-----------+---------------+---------------+----------+-------+

Now comes a standard gaps-and-islands step to collapse consecutive days with the same price together. I use a difference of two row number sequences here.

I've added some more rows to your sample data to see the gaps within the same ProductId.

    INSERT INTO History (ProductId, DestinationId, ScheduledDate, Quantity)
    VALUES
      (0, 1000, '20180601', 5),
      (0, 1000, '20180602', 10),
      (0, 1000, '20180603', 7),
      (3, 5000, '20180607', 15),
      (3, 5000, '20180608', 23),
      (3, 5000, '20180609', 52),
      (3, 5000, '20180610', 12),
      (3, 5000, '20180611', 14);

If you run this intermediate query you'll see how it works:

    WITH
    CTE_Prices
    AS
    (
        SELECT
            History.ProductId
            ,History.DestinationId
            ,History.ScheduledDate
            ,History.Quantity
            ,A.Price
        FROM
            History
            OUTER APPLY
            (
                SELECT TOP(1)
                    PriceChange.Price
                FROM
                    PriceChange
                WHERE
                    PriceChange.ProductID = History.ProductID
                    AND PriceChange.DestinationId = History.DestinationId
                    AND PriceChange.EffectiveDate <= History.ScheduledDate
                ORDER BY
                    PriceChange.EffectiveDate DESC
            ) AS A
    )
    ,CTE_rn
    AS
    (
        SELECT
            ProductId
            ,DestinationId
            ,ScheduledDate
            ,Quantity
            ,Price
            -- rn1: position of the row within its product/destination/price partition
            ,ROW_NUMBER() OVER (PARTITION BY ProductId, DestinationId, Price ORDER BY ScheduledDate) AS rn1
            -- rn2: day number counted from a fixed anchor date
            ,DATEDIFF(day, '20000101', ScheduledDate) AS rn2
        FROM
            CTE_Prices
    )
    SELECT *
        ,rn2-rn1 AS Diff
    FROM CTE_rn

Intermediate result

    +-----------+---------------+---------------+----------+-------+-----+------+------+
    | ProductId | DestinationId | ScheduledDate | Quantity | Price | rn1 | rn2  | Diff |
    +-----------+---------------+---------------+----------+-------+-----+------+------+
    |         0 |          1000 | 2018-04-01    |        5 |     1 |   1 | 6665 | 6664 |
    |         0 |          1000 | 2018-04-02    |       10 |     2 |   1 | 6666 | 6665 |
    |         0 |          1000 | 2018-04-03    |        7 |     2 |   2 | 6667 | 6665 |
    |         0 |          1000 | 2018-06-01    |        5 |     2 |   3 | 6726 | 6723 |
    |         0 |          1000 | 2018-06-02    |       10 |     2 |   4 | 6727 | 6723 |
    |         0 |          1000 | 2018-06-03    |        7 |     2 |   5 | 6728 | 6723 |
    |         3 |          5000 | 2018-05-07    |       15 |     5 |   1 | 6701 | 6700 |
    |         3 |          5000 | 2018-05-08    |       23 |     5 |   2 | 6702 | 6700 |
    |         3 |          5000 | 2018-05-09    |       52 |     5 |   3 | 6703 | 6700 |
    |         3 |          5000 | 2018-05-10    |       12 |    20 |   1 | 6704 | 6703 |
    |         3 |          5000 | 2018-05-11    |       14 |    20 |   2 | 6705 | 6703 |
    |         3 |          5000 | 2018-06-07    |       15 |    20 |   3 | 6732 | 6729 |
    |         3 |          5000 | 2018-06-08    |       23 |    20 |   4 | 6733 | 6729 |
    |         3 |          5000 | 2018-06-09    |       52 |    20 |   5 | 6734 | 6729 |
    |         3 |          5000 | 2018-06-10    |       12 |    20 |   6 | 6735 | 6729 |
    |         3 |          5000 | 2018-06-11    |       14 |    20 |   7 | 6736 | 6729 |
    +-----------+---------------+---------------+----------+-------+-----+------+------+

Within a run of consecutive days at the same price, rn1 and rn2 both grow by one per row, so Diff stays constant; a gap in the dates makes rn2 jump and starts a new Diff value. Now simply group by the Diff to get one row per interval.

Final query

    WITH
    CTE_Prices
    AS
    (
        SELECT
            History.ProductId
            ,History.DestinationId
            ,History.ScheduledDate
            ,History.Quantity
            ,A.Price
        FROM
            History
            OUTER APPLY
            (
                SELECT TOP(1)
                    PriceChange.Price
                FROM
                    PriceChange
                WHERE
                    PriceChange.ProductID = History.ProductID
                    AND PriceChange.DestinationId = History.DestinationId
                    AND PriceChange.EffectiveDate <= History.ScheduledDate
                ORDER BY
                    PriceChange.EffectiveDate DESC
            ) AS A
    )
    ,CTE_rn
    AS
    (
        SELECT
            ProductId
            ,DestinationId
            ,ScheduledDate
            ,Quantity
            ,Price
            ,ROW_NUMBER() OVER (PARTITION BY ProductId, DestinationId, Price ORDER BY ScheduledDate) AS rn1
            ,DATEDIFF(day, '20000101', ScheduledDate) AS rn2
        FROM
            CTE_Prices
    )
    SELECT
        ProductId
        ,DestinationId
        ,MIN(ScheduledDate) AS StartDate
        ,MAX(ScheduledDate) AS EndDate
        ,SUM(Quantity) AS TotalQuantity
        ,Price
    FROM
        CTE_rn
    GROUP BY
        ProductId
        ,DestinationId
        ,Price
        ,rn2-rn1    -- one group per island
    ORDER BY
        ProductID
        ,DestinationId
        ,StartDate;

Final result

    +-----------+---------------+------------+------------+---------------+-------+
    | ProductId | DestinationId | StartDate  |  EndDate   | TotalQuantity | Price |
    +-----------+---------------+------------+------------+---------------+-------+
    |         0 |          1000 | 2018-04-01 | 2018-04-01 |             5 |     1 |
    |         0 |          1000 | 2018-04-02 | 2018-04-03 |            17 |     2 |
    |         0 |          1000 | 2018-06-01 | 2018-06-03 |            22 |     2 |
    |         3 |          5000 | 2018-05-07 | 2018-05-09 |            90 |     5 |
    |         3 |          5000 | 2018-05-10 | 2018-05-11 |            26 |    20 |
    |         3 |          5000 | 2018-06-07 | 2018-06-11 |           116 |    20 |
    +-----------+---------------+------------+------------+---------------+-------+
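
If you want to sanity-check the row-number difference outside the database, here is a minimal Python sketch of the same idea. It is not part of the SQL solution: the function name collapse_islands and the tuple layout are my own illustrative choices, and the input rows are simply the already-priced rows from the intermediate result (the PriceChange lookup is skipped).

```python
from collections import defaultdict
from datetime import date

# (ProductId, DestinationId, ScheduledDate, Quantity, Price),
# copied from the intermediate result in the answer.
rows = [
    (0, 1000, date(2018, 4, 1), 5, 1),
    (0, 1000, date(2018, 4, 2), 10, 2),
    (0, 1000, date(2018, 4, 3), 7, 2),
    (0, 1000, date(2018, 6, 1), 5, 2),
    (0, 1000, date(2018, 6, 2), 10, 2),
    (0, 1000, date(2018, 6, 3), 7, 2),
    (3, 5000, date(2018, 5, 7), 15, 5),
    (3, 5000, date(2018, 5, 8), 23, 5),
    (3, 5000, date(2018, 5, 9), 52, 5),
    (3, 5000, date(2018, 5, 10), 12, 20),
    (3, 5000, date(2018, 5, 11), 14, 20),
    (3, 5000, date(2018, 6, 7), 15, 20),
    (3, 5000, date(2018, 6, 8), 23, 20),
    (3, 5000, date(2018, 6, 9), 52, 20),
    (3, 5000, date(2018, 6, 10), 12, 20),
    (3, 5000, date(2018, 6, 11), 14, 20),
]

BASE = date(2000, 1, 1)  # same anchor as DATEDIFF(day, '20000101', ...)


def collapse_islands(rows):
    """Hypothetical helper: merge consecutive days with the same
    product, destination and price into one interval."""
    counters = defaultdict(int)  # rn1 per (product, destination, price)
    islands = {}                 # (product, dest, price, diff) -> [start, end, total]
    for product, dest, day, qty, price in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        counters[(product, dest, price)] += 1
        rn1 = counters[(product, dest, price)]  # ROW_NUMBER() equivalent
        rn2 = (day - BASE).days                 # DATEDIFF(day, ...) equivalent
        key = (product, dest, price, rn2 - rn1)
        island = islands.setdefault(key, [day, day, 0])
        island[0] = min(island[0], day)
        island[1] = max(island[1], day)
        island[2] += qty
    result = [
        (product, dest, start, end, total, price)
        for (product, dest, price, _), (start, end, total) in islands.items()
    ]
    # Same ordering as the final query: product, destination, start date.
    return sorted(result, key=lambda r: (r[0], r[1], r[2]))


for interval in collapse_islands(rows):
    print(interval)
```

For the rows above this produces the same six intervals as the final result table, which is a quick way to convince yourself that grouping by rn2-rn1 splits on both date gaps and price changes.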

Answer