Skip to content
Advertisement

How to use SQL math functions with multiple joins

in SQL Server I have two fact tables. FactA I will use COUNT(DISTINCT) and need to use two joins to get the needed GROUP BY and WHERE data from DimB. FactB uses SUM and needs only one join to DimB. Now I can make these queries work one by one, but I am at complete lost, trying to merge them into one query.

Here is an extremely simplified example data:

    CREATE TABLE FactA
    ([Customer] varchar(3), [Dept] varchar(5))
;
    
INSERT INTO FactA
    ([Customer], [Dept])
VALUES
    ('Aaa', 'DeptA'),
    ('Bbb', 'DeptA'),
    ('Ccc', 'DeptA'),
    ('Ddd', 'DeptB'),
    ('Fff', 'DeptB'),
    ('Aaa', 'DeptC'),
    ('Aaa', 'DeptC')
;

CREATE TABLE DimA
    ([Dept] varchar(5), [Area] varchar(2))
;
    
INSERT INTO DimA
    ([Dept], [Area])
VALUES
    ('DeptA', 'SW'),
    ('DeptB', 'NW'),
    ('DeptC', 'SE')
;


CREATE TABLE DimB
    ([Area] varchar(2), [Code] varchar(2), [PO] varchar(2))
;
    
INSERT INTO DimB
    ([Area], [Code], [PO])
VALUES
    ('SW', 'XA', 'Z1'),
    ('NW', 'XB', 'Z3'),
    ('SE', 'XC', 'Z5')
;


CREATE TABLE FactB
    ([Sales] int, [Code] VARCHAR(2))
;
    
INSERT INTO FactB
    ([Sales], [Code])
VALUES
    (20, 'XA'),
    (10, 'XA'),
    (25, 'XA'),
    (30, 'XB'),
    (40, 'XB'),
    (15, 'XC'),
    (20, 'XC'),
    (35, 'XC')
;

Example queries that work by themselves and give expected results are:

SELECT DimB.PO,COUNT(DISTINCT FactA.Customer) As Customers
FROM FactA
JOIN DimA ON DimA.Dept = FactA.Dept
Join DimB ON DimB.Area = DimA.Area
GROUP BY DimB.PO

SELECT DimB.PO, sum(FactB.Sales) As TotalSales
FROM FactB
JOIN DimB ON DimB.Code = FactB.Code
GROUP BY DimB.PO

I need to combine these two, so I can get the results with one query, and use GROUP BY and WHERE clause from DimB. I tried this:

SELECT DimB.PO,COUNT(DISTINCT FactA.Customer) As Customers, SUM(FactB.Sales) As TotalSales
FROM FactA
JOIN DimA ON DimA.Dept = FactA.Dept
Join DimB ON DimB.Area = DimA.Area
Join FactB ON FactB.Code = DimB.Code
GROUP BY DimB.PO

And it failed miserably, because it is giving way too large numbers in TotalSales. I believe it is summing way too much sales, because of the many joins. Searching around I belive subqueries might be the answer, but I am overwhelmed about which part I should be subquerying. Many thanks to anyone who has time to help.

Advertisement

Answer

I’m going to guess these are the results you want. Please always state the desired results explicitly so we don’t have to guess.

;WITH SalesByCode(Code, Sales) AS 
(
  SELECT Code, SUM(Sales) FROM dbo.FactB GROUP BY Code
)
SELECT DimB.PO, 
  Customers  = COUNT(DISTINCT FactA.Customer), 
  TotalSales = MAX(sbc.Sales)
FROM dbo.FactA
JOIN dbo.DimA ON DimA.Dept = FactA.Dept
Join dbo.DimB ON DimB.Area = DimA.Area
Join SalesByCode AS sbc ON DimB.Code = sbc.Code
GROUP BY DimB.PO;

Results:

PO      Customers    TotalSales 
----    ---------    ----------
Z1              3            55
Z3              2            70
Z5              1            70
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement