Skip to content
Advertisement

How to aggregate values in two different columns under different criteria?

Here’s the table of data I’d like to work on:

CUST_REF ACC_NO JOINT_ACC PRODUCT NUM_OF_ACC CALC_FEE ACTUAL_FEE DIFF
100 ABC123 N ACCOUNT 4 140.68 104.14 36.54
100 ABC456 N STOCKS 4 41.72 30.24 11.48
100 XYZ123 N ISA 4 48.26 32.27 15.99
100 XYZ444 Y PENSION 4 3.15 0.00 3.15

I now need to work in the impact of a fee cap, which varies based on whether or not the account is jointly held. Y in the JOINT_ACC field denotes a joint account.

Fees are capped at 166.67 per month per client for all sole accounts (JOINT_ACC = ‘N’) – i.e. the total fee across all sole accounts cannot exceed 166.67. However, for joint accounts, the cap is applied at the individual account level. In other words, the CALC_FEE when JOINT_ACC = ‘Y’ cannot exceed 166.67 – in this case the account has a fee well below 166.67 so it’s left as is (I don’t care about the actual fee being 0, that’s a separate question for the great and the good).

In this example, the actual fee has somehow already considered this information – if you add up the three entries in the ACTUAL_FEE column, they add up to 166.67.

I therefore want to:

  • Aggregate the CALC_FEE entries based on the JOINT_ACC flag and apply CASE-based conditions to work in the fee cap
  • Aggregate the ACTUAL_FEE across sole and joint accounts (I don’t want to apply a fee cap here as my objective is to compare the calculated fee with the fees actually charged).

So in effect my ideal output table would look like this:

CUST_REF ACC_NO JOINT_ACC PRODUCT NUM_OF_ACC CALC_FEE ACTUAL_FEE DIFF
100 ABC123 N ACCOUNT 4 166.67 166.67 0.00
100 ABC456 N STOCKS 4 166.67 166.67 0.00
100 XYZ123 N ISA 4 166.67 166.67 0.00
100 XYZ444 Y PENSION 4 3.15 0.00 0.00

Here’s what I’ve tried so far:

    SELECT 
        A.CUST_REF,
        A.ACC_NO,
        A.JOINT_ACC,
        A.PRODUCT,
        A.NUM_OF_ACC,
        SUM(A.ACTUAL_FEE) OVER (PARTITION BY A.CUST_REF, A.ACC_NO, A.JOINT_ACC) AS FEES_CHARGED,
        
        CASE
           WHEN A.JOINT_ACC_IND = 'N'
             THEN
                (CASE
                    WHEN (SUM(B.CALC_FEE) OVER PARTITION BY (A.CUST_REF, A.ACC_NO)) > 166.67 THEN (166.67)
                    ELSE (SUM(B.CALC_FEE) OVER PARTITION BY (A.CUST_REF, A.ACC_NO))
                 END)
           WHEN A.JOINT_ACC_IND = 'Y'
             THEN
                (CASE
                    WHEN (C.CALC_FEE) > 166.67 THEN (166.67)
                    ELSE (C.CALC_FEE)
                 END)
           END
               AS ADJ_FEE_CALC,

         ((CASE
           WHEN A.JOINT_ACC_IND = 'N'
             THEN
                (CASE
                    WHEN (SUM(B.CALC_FEE) OVER PARTITION BY (A.CUST_REF, A.ACC_NO)) > 166.67 THEN (166.67)
                    ELSE (SUM(B.CALC_FEE) OVER PARTITION BY (A.CUST_REF, A.ACC_NO))
                 END)
           WHEN A.JOINT_ACC_IND = 'Y'
             THEN
                (CASE
                    WHEN (C.CALC_FEE) > 166.67 THEN (166.67)
                    ELSE (C.CALC_FEE)
                 END)
           END) - (SUM(A.ACTUAL_FEE) OVER (PARTITION BY A.CUST_REF, A.ACC_NO, A.JOINT_ACC))) AS DIFF

FROM V_FEES_TABLE A

     LEFT JOIN V_FEES_TABLE B ON A.CUST_REF = B.CUST_REF AND A.ACC_NO = B.ACC_NO AND B.JOINT_ACC = 'N'
     LEFT JOIN V_FEES_TABLE C ON A.CUST_REF = C.CUST_REF AND A.ACC_NO = C.ACC_NO AND C.JOINT_ACC = 'Y'

This query is taking a long time to run (over an hour when I checked a few minutes ago). Clearly I’m doing something fundamentally wrong / inefficient. I don’t know if it makes a difference but V_FEES_TABLE is a view that is built atop another view, which in turn references core tables in the DB.

Please help! Thanks in advance.


EDIT:

I have a few cases where the code suggested below throws up false positives:

CUST_REF ACC_NO JOINT_ACC PRODUCT NUM_OF_ACC CALC_FEE ACTUAL_FEE CUST_FEE_CALC ACTUAL_CUST_FEE_CHARGED DIFF
100 ABC123 N ACCOUNT 1 95.45 94.29 166.67 379.3 -212.63
100 ABC123 N ACCOUNT 1 95.45 95.36 166.67 379.3 -212.63

The code I used to rerun the work was:

    SELECT 
            A.CUST_REF,
            A.ACC_NO,
            A.JOINT_ACC,
            A.PRODUCT,
            A.NUM_OF_ACC,
            A.CALC_FEE,
            A.ACTUAL_FEE,
    
    
    (CASE WHEN JOINT_ACC = 'Y' AND CALC_FEE < 166.67
                 THEN CALC_FEE
                 WHEN JOINT_ACC = 'Y'
                 THEN 166.67
                 WHEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF,    JOINT_ACC) < 166.67
                 THEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF, JOINT_ACC)            
                 ELSE 166.67
             END) as CUST_FEE_CALC,
    
            SUM(A.ACTUAL_FEE) OVER (PARTITION BY A.CUST_REF, A.JOINT_ACC) AS ACTUAL_CUST_FEE_CHARGED,

(CASE WHEN JOINT_ACC = 'Y' AND CALC_FEE < 166.67
                 THEN CALC_FEE
                 WHEN JOINT_ACC = 'Y'
                 THEN 166.67
                 WHEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF,    JOINT_ACC) < 166.67
                 THEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF, JOINT_ACC)            
                 ELSE 166.67
             END) - SUM(A.ACTUAL_FEE) OVER (PARTITION BY A.CUST_REF, A.JOINT_ACC) 
              as DIFF

FROM FEES_TABLE A

It would appear that there are instances where the same account has been billed twice in the same period, for different amounts – effectively, the SQL treats this as two separate accounts for the purposes of the aggregation. This is inadvertently distorting my own calculation as it adds up 95.45 twice and presents the client-level fee as 166.67, given the cap that has been worked in above based on Gordon’s solution.

I want SQL to leave the calculated client fee aggregated at the account level, but add up the actual fees charged as I’m not sure why I see the different figures of 94.29 and 95.36 here. As such I’d like to see:

CUST_REF ACC_NO JOINT_ACC PRODUCT NUM_OF_ACC CALC_FEE ACTUAL_FEE CUST_FEE_CALC ACTUAL_CUST_FEE_CHARGED DIFF
100 ABC123 N ACCOUNT 1 95.45 94.29 95.45 189.65 -94.20
100 ABC123 N ACCOUNT 1 95.45 95.36 95.45 189.65 -94.20

I tried to modify the PARTITION BY criteria to also include the ACC_NO column but that hasn’t been successful. Any ideas?

Advertisement

Answer

I don’t understand why any joins are necessary, just window functions. To get the calculated fees:

SELECT FT.*,
       (CASE WHEN JOINT_ACC = 'Y' AND CALC_FEE < 166.67
             THEN CALC_FEE
             WHEN JOINT_ACC = 'Y'
             THEN 166.67
             WHEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF, JOINT_ACC) < 166.67
             THEN SUM(CALC_FEE) OVER (PARTITION BY CUST_REF, JOINT_ACC)            
             ELSE 166.67
         END) as IMPUTED_CALC_FEE
FROM V_FEES_TABLE FT
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement