Skip to content
Advertisement

SQL getting balance point of two cumulative sums

Bellow I provide a code, which does the job on this interesting task to some extent, but may be poorly designed due to my SQL knowledge limits. The main problem is, that the query gives random (more exactly 3 different) results when executed. My guess is, that the rows within nested queries get ordered by “randomly” selected column and that is why the final result is different (balance point is order-dependent).

The inner SELECT with GROUP creates a list of r-dates and two cumulative sums, such as this:

rIndex  r             TotalPerDay   CumulativeSum1  CumulativeSum2
1       02.05.2019    92,81         92,81           0
2       03.05.2019    24,81         117,61          0
3       06.05.2019    43,79         161,40          60
4       07.05.2019    78,65         240,05          120
5       09.05.2019    33,99         274,04          180
6       10.05.2019    73,22         347,26          240
7       13.05.2019    19,24         366,50          300
8       14.05.2019    150,77        517,27          360
9       15.05.2019    22,69         539,95          420
10      16.05.2019    4,96          544,91          480
11      17.05.2019    17,45         562,36          540
12      20.05.2019    27,19         589,55          600
13      21.05.2019    12,45         602,00          660
14      22.05.2019    18,08         620,08          720
15      23.05.2019    3,49          623,57          780
16      24.05.2019    10,51         634,09          840
17      27.05.2019    6,19          640,28          900
18      28.05.2019    3,01          643,29          960
19      29.05.2019    2,68          645,97          1020
20      30.05.2019    184,51        830,48          1080

An attempt for sample data is in an attachment (removed due to comment bellow).

In the 2nd nested SELECT I find a balance-point, which is a (first) date, where CumulativeSum1 > CumulativeSum2. I then have to find an index of days with sums (because there are also days without data) and that is the final result; it’s the top-most SELECT in the query bellow:

DECLARE @eDate as Date
DECLARE @DayLimit INT
SET @DayLimit = 60  -- let's assume a constant here
SET @eDate = DATEFROMPARTS('2019','05','31')

-- get balance point INDEX over non-empty days
SELECT (SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC3.BalanceDate) as rIndex
FROM
    (    
    SELECT TOP 1 SRC2.rDate   -- get first balance point (date)
    FROM
        (
        SELECT 
             ROW_NUMBER() OVER (ORDER BY SRC.rDate) as RowNo
            ,SRC.rDate 
            ,SRC.TotalPerDay      -- not required for processing, included just for info and check
            ,(SELECT (SUM((eTime-ISNULL(rDura,0))/60)) FROM MyTable1 as MT WHERE MT.r <= SRC.rDate AND MT.r < @eDate)         as CumulativeSum1
            ,((SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC.rDate) * @DayLimit) as CumulativeSum2
        FROM (
            SELECT   
                  CASE  
                      WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                      THEN DATEADD(dd,-1,CAST(GETDATE() as date))                
                      ELSE CAST(r as date)                           
                  END as rDate, 
                  SUM((eTime-ISNULL(rDura,0))/60) as TotalPerDay      
            FROM MyTable1 
            WHERE r < @eDate
            GROUP BY  -- group by non-empty dates, group all past dates to yesterday
                   CASE  
                       WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                      THEN DATEADD(dd,-1,CAST(GETDATE() as date))               
                      ELSE CAST(r as date)                             
                   END                                
        ) as SRC                          
        --ORDER BY rDate
        ) as SRC2  -- compiled list of sums per day
    WHERE SRC2.CumulativeSum2 > SRC2.CumulativeSum1;    -- balance condition
) as SRC3

I humbly ask for suggestions for the obvious issues:

  • how could I ensure order of rows in the nested query to get reliable result?
  • are there obvious mistakes in my query design and how to improve it?

Also, I just realized there’s a discrepancy in the topmost query, where I get index over bank days, hower the index should be over non-empty bank days…

Some sample data:

-------  CALENDAR TABLE  --------------------------------------------------------------------

CREATE TABLE [dbo].[Calendar](
    [cDate] [datetime] NOT NULL,
    [KindOfDay] [varchar](10) NOT NULL
PRIMARY KEY CLUSTERED 
(
    [cDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


INSERT INTO [dbo].[Calendar] ([cDate],[KindOfDay])  VALUES 
    ('2019-04-20 00:00:00.000', 'SATURDAY'),
    ('2019-04-21 00:00:00.000', 'SUNDAY'),
    ('2019-04-22 00:00:00.000', 'HOLIDAY'),
    ('2019-04-23 00:00:00.000', 'BANKDAY'),
    ('2019-04-24 00:00:00.000', 'BANKDAY'),
    ('2019-04-25 00:00:00.000', 'BANKDAY'),
    ('2019-04-26 00:00:00.000', 'BANKDAY'),
    ('2019-04-27 00:00:00.000', 'SATURDAY'),
    ('2019-04-28 00:00:00.000', 'SUNDAY'),
    ('2019-04-29 00:00:00.000', 'BANKDAY'),
    ('2019-04-30 00:00:00.000', 'BANKDAY'),
    ('2019-05-01 00:00:00.000', 'HOLIDAY'),
    ('2019-05-02 00:00:00.000', 'BANKDAY'),
    ('2019-05-03 00:00:00.000', 'BANKDAY'),
    ('2019-05-04 00:00:00.000', 'SATURDAY'),
    ('2019-05-05 00:00:00.000', 'SUNDAY'),
    ('2019-05-06 00:00:00.000', 'BANKDAY'),
    ('2019-05-07 00:00:00.000', 'BANKDAY'),
    ('2019-05-08 00:00:00.000', 'HOLIDAY'),
    ('2019-05-09 00:00:00.000', 'BANKDAY'),
    ('2019-05-10 00:00:00.000', 'BANKDAY'),
    ('2019-05-11 00:00:00.000', 'SATURDAY'),
    ('2019-05-12 00:00:00.000', 'SUNDAY'),
    ('2019-05-13 00:00:00.000', 'BANKDAY'),
    ('2019-05-14 00:00:00.000', 'BANKDAY'),
    ('2019-05-15 00:00:00.000', 'BANKDAY'),
    ('2019-05-16 00:00:00.000', 'BANKDAY'),
    ('2019-05-17 00:00:00.000', 'BANKDAY'),
    ('2019-05-18 00:00:00.000', 'SATURDAY'),
    ('2019-05-19 00:00:00.000', 'SUNDAY'),
    ('2019-05-20 00:00:00.000', 'BANKDAY'),
    ('2019-05-21 00:00:00.000', 'BANKDAY'),
    ('2019-05-22 00:00:00.000', 'BANKDAY'),
    ('2019-05-23 00:00:00.000', 'BANKDAY'),
    ('2019-05-24 00:00:00.000', 'BANKDAY'),
    ('2019-05-25 00:00:00.000', 'SATURDAY'),
    ('2019-05-26 00:00:00.000', 'SUNDAY'),
    ('2019-05-27 00:00:00.000', 'BANKDAY'),
    ('2019-05-28 00:00:00.000', 'BANKDAY'),
    ('2019-05-29 00:00:00.000', 'BANKDAY'),
    ('2019-05-30 00:00:00.000', 'BANKDAY'),
    ('2019-05-31 00:00:00.000', 'BANKDAY'),
    ('2019-06-01 00:00:00.000', 'SATURDAY'),
    ('2019-06-02 00:00:00.000', 'SUNDAY'),
    ('2019-06-03 00:00:00.000', 'BANKDAY'),
    ('2019-06-04 00:00:00.000', 'BANKDAY'),
    ('2019-06-05 00:00:00.000', 'BANKDAY'),
    ('2019-06-06 00:00:00.000', 'BANKDAY'),
    ('2019-06-07 00:00:00.000', 'BANKDAY'),
    ('2019-06-08 00:00:00.000', 'SATURDAY'),
    ('2019-06-09 00:00:00.000', 'SUNDAY'),
    ('2019-06-10 00:00:00.000', 'BANKDAY'),
    ('2019-06-11 00:00:00.000', 'BANKDAY'),
    ('2019-06-12 00:00:00.000', 'BANKDAY'),
    ('2019-06-13 00:00:00.000', 'BANKDAY'),
    ('2019-06-14 00:00:00.000', 'BANKDAY'),
    ('2019-06-15 00:00:00.000', 'SATURDAY'),
    ('2019-06-16 00:00:00.000', 'SUNDAY'),
    ('2019-06-17 00:00:00.000', 'BANKDAY'),
    ('2019-06-18 00:00:00.000', 'BANKDAY'),
    ('2019-06-19 00:00:00.000', 'BANKDAY'),
    ('2019-06-20 00:00:00.000', 'BANKDAY')
GO



-------  MyTable1 TABLE  --------------------------------------------------------------------

CREATE TABLE [dbo].[MyTable1](
    [ID] [int] NOT NULL,
    [rDate] [date] NOT NULL,
    [eTime] [decimal](12,6) NOT NULL,
    [rDura] [date] NULL
) 



INSERT INTO MyTable1 (ID, rDura, eTime, rDate) VALUES
    (17008431,NULL,0.1855,'2019-05-02'), 
    (17008477,NULL,0.059,'2019-05-02'), 
    (17008500,NULL,0.329667,'2019-05-02'), 
    (17090449,NULL,3.3195,'2019-05-02'), 
    (16888594,NULL,13.830667,'2019-04-26'), 
    (16888681,NULL,12.6635,'2019-04-26'), 
    (16888722,NULL,8.154667,'2019-05-07'), 
    (16888750,NULL,7.83,'2019-05-07'), 
    (16888766,NULL,5.22,'2019-05-07'), 
    (16955798,NULL,12.35,'2019-05-07'), 
    (17108201,NULL,1.669833,'2019-05-07'), 
    (17110834,NULL,2.596667,'2019-05-02'), 
    (17111001,NULL,0.814667,'2019-05-06'), 
    (16893842,NULL,1.053,'2019-05-07'), 
    (16951779,NULL,2.720833,'2019-05-03'), 
    (16951821,NULL,4.042333,'2019-05-06'), 
    (17017058,NULL,0.227333,'2019-05-02'), 
    (17017060,NULL,1.06,'2019-05-02'), 
    (17017066,NULL,1.869333,'2019-05-02'), 
    (17019289,NULL,0.835667,'2019-04-26'), 
    (17020295,NULL,3.983333,'2019-04-21'), 
    (17106404,105,3.3545,'2019-04-29'), 
    (17107843,NULL,2.815167,'2019-05-07'), 
    (16725584,NULL,0.693,'2019-04-25'), 
    (17101197,NULL,3.906667,'2019-04-30'), 
    (17101993,NULL,0.571667,'2019-05-06'), 
    (17102225,NULL,3.048833,'2019-04-30'), 
    (17102482,NULL,7.5945,'2019-05-10'), 
    (16974196,NULL,1.633333,'2019-05-06'), 
    (17113406,NULL,0.871833,'2019-05-02'), 
    (17113408,NULL,0.749833,'2019-05-02'), 
    (17113784,NULL,1.961333,'2019-05-03'), 
    (17120601,NULL,4.033333,'2019-05-06'), 
    (17120609,NULL,3.983333,'2019-05-06'), 
    (17120618,NULL,2.626667,'2019-05-06'), 
    (17120626,NULL,2.64,'2019-05-06'), 
    (17120628,NULL,3.684167,'2019-05-06'), 
    (17121720,NULL,2.235,'2019-04-30'), 
    (17058455,NULL,5.806667,'2019-04-29'), 
    (17059476,NULL,2.264833,'2019-05-22'), 
    (17059478,NULL,182.603667,'2019-05-30'), 
    (17065386,NULL,5.539667,'2019-05-10'), 
    (16927091,NULL,1.381,'2019-05-14'), 
    (16927093,NULL,112.304685,'2019-05-14'), 
    (16991456,NULL,0.931667,'2019-04-29'), 
    (17122394,NULL,1.560167,'2019-05-03'), 
    (17126711,NULL,4.046,'2019-05-03'), 
    (16935823,NULL,0.359,'2019-04-25'), 
    (17069727,NULL,1.952833,'2019-05-03'), 
    (17069870,NULL,1.742333,'2019-05-02'), 
    (17070555,NULL,5.416667,'2019-05-02'), 
    (17070557,NULL,3.894167,'2019-05-02'), 
    (17070851,NULL,2.64,'2019-04-23'), 
    (17073724,NULL,0.737667,'2019-05-03'), 
    (17074763,NULL,1.413833,'2019-05-02'), 
    (17131824,NULL,4.258,'2019-05-10'), 
    (17132133,NULL,0.257667,'2019-05-14'), 
    (17132865,NULL,2.769833,'2019-05-17'), 
    (17138082,NULL,7.866667,'2019-05-31'), 
    (17139196,NULL,5.860167,'2019-05-03'), 
    (17139200,NULL,1.479667,'2019-05-03'), 
    (16983337,NULL,2.951667,'2019-05-02'), 
    (17028542,NULL,0.680333,'2019-05-13'), 
    (16823160,NULL,5,'2019-05-06'), 
    (16823168,NULL,5,'2019-05-06'), 
    (16823182,NULL,5,'2019-05-06'), 
    (16823192,NULL,5,'2019-05-06'), 
    (16906776,NULL,0.8635,'2019-05-02'), 
    (17082286,NULL,3.333333,'2019-05-09'), 
    (17083776,NULL,2.317167,'2019-04-25'), 
    (17083778,NULL,1.447167,'2019-05-02'), 
    (17084568,NULL,0.2375,'2019-05-02'), 
    (17154415,NULL,2.64,'2019-05-14'), 
    (17154425,NULL,2.626667,'2019-05-14'), 
    (17154453,NULL,0.052,'2019-05-06'), 
    (17155029,NULL,3.256667,'2019-05-22'), 
    (17157159,NULL,1.333333,'2019-05-15'), 
    (16994233,NULL,0.252167,'2019-04-29'), 
    (17039767,NULL,1.401667,'2019-05-10'), 
    (17040346,NULL,4.021667,'2019-05-09'), 
    (17040815,NULL,1.2675,'2019-05-16'), 
    (17042063,NULL,0.213333,'2019-05-03'), 
    (17050144,NULL,0.976667,'2019-05-02'), 
    (17050150,NULL,0.837167,'2019-05-20'), 
    (17051422,NULL,1.826,'2019-05-07'), 
    (17142464,NULL,0.464333,'2019-05-06'), 
    (17145501,NULL,4.745333,'2019-06-06'), 
    (17145980,NULL,0.195167,'2019-05-07'), 
    (17145999,NULL,1.330833,'2019-05-07'), 
    (17146001,NULL,1.503833,'2019-05-06'), 
    (17146011,NULL,1.22,'2019-05-03'), 
    (17146017,NULL,0.373,'2019-05-07'), 
    (17146023,NULL,0.5745,'2019-05-03'), 
    (17146127,NULL,1.7835,'2019-05-15'), 
    (17146131,NULL,13.5595,'2019-05-07'), 
    (17152617,NULL,4.535667,'2019-05-10'), 
    (17154390,NULL,3.983333,'2019-05-14'), 
    (17154398,NULL,5.416667,'2019-05-14'), 
    (17154400,NULL,3.684167,'2019-05-14')

 GO

Advertisement

Answer

Suggested ROW_NUMBER() did not help to overcome the problĂ©m on it’s own. I had to split task into two steps: Fist, I had to set a variable @bDate to store the result of the 2 inner nested SELECTs and subsequently find the index of this date in separate SELECT step.

DECLARE @eDate as Date
DECLARE @DayLimit INT
SET @DayLimit = 60  -- let's assume a constant here
SET @eDate = DATEFROMPARTS('2019','05','31')

-- get balance point INDEX over non-empty days    
SELECT TOP 1 @bDate = SRC2.rDate   -- get first balance point (date)
FROM
    (
    SELECT 
            ROW_NUMBER() OVER (ORDER BY SRC.rDate) as RowNo
        ,SRC.rDate 
        ,SRC.TotalPerDay      -- not required for processing, included just for info and check
        ,(SELECT (SUM((eTime-ISNULL(rDura,0))/60)) FROM MyTable1 as MT WHERE MT.r <= SRC.rDate AND MT.r < @eDate)         as CumulativeSum1
        ,((SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC.rDate) * @DayLimit) as CumulativeSum2
    FROM (
        SELECT   
                CASE  
                    WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                    THEN DATEADD(dd,-1,CAST(GETDATE() as date))                
                    ELSE CAST(r as date)                           
                END as rDate, 
                SUM((eTime-ISNULL(rDura,0))/60) as TotalPerDay      
        FROM MyTable1 
        WHERE r < @eDate
        GROUP BY  -- group by non-empty dates, group all past dates to yesterday
                CASE  
                    WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                    THEN DATEADD(dd,-1,CAST(GETDATE() as date))               
                    ELSE CAST(r as date)                             
                END                                
    ) as SRC                          
    --ORDER BY rDate
    ) as SRC2  -- compiled list of sums per day
WHERE SRC2.CumulativeSum2 > SRC2.CumulativeSum1;    -- balance condition

-- get the index of bank day from Today
SELECT (SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND @bDate); 
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement