in SQL Server I have two fact tables. FactA I will use COUNT(DISTINCT) and need to use two joins to get the needed GROUP BY and WHERE data from DimB. FactB uses SUM and needs only one join to DimB. Now I can make these queries work one by one, but I am at complete lost, trying to merge them into one query.
Here is an extremely simplified example data:
CREATE TABLE FactA ([Customer] varchar(3), [Dept] varchar(5)) ; INSERT INTO FactA ([Customer], [Dept]) VALUES ('Aaa', 'DeptA'), ('Bbb', 'DeptA'), ('Ccc', 'DeptA'), ('Ddd', 'DeptB'), ('Fff', 'DeptB'), ('Aaa', 'DeptC'), ('Aaa', 'DeptC') ; CREATE TABLE DimA ([Dept] varchar(5), [Area] varchar(2)) ; INSERT INTO DimA ([Dept], [Area]) VALUES ('DeptA', 'SW'), ('DeptB', 'NW'), ('DeptC', 'SE') ; CREATE TABLE DimB ([Area] varchar(2), [Code] varchar(2), [PO] varchar(2)) ; INSERT INTO DimB ([Area], [Code], [PO]) VALUES ('SW', 'XA', 'Z1'), ('NW', 'XB', 'Z3'), ('SE', 'XC', 'Z5') ; CREATE TABLE FactB ([Sales] int, [Code] VARCHAR(2)) ; INSERT INTO FactB ([Sales], [Code]) VALUES (20, 'XA'), (10, 'XA'), (25, 'XA'), (30, 'XB'), (40, 'XB'), (15, 'XC'), (20, 'XC'), (35, 'XC') ;
Example queries that work by themselves and give expected results are:
SELECT DimB.PO,COUNT(DISTINCT FactA.Customer) As Customers FROM FactA JOIN DimA ON DimA.Dept = FactA.Dept Join DimB ON DimB.Area = DimA.Area GROUP BY DimB.PO SELECT DimB.PO, sum(FactB.Sales) As TotalSales FROM FactB JOIN DimB ON DimB.Code = FactB.Code GROUP BY DimB.PO
I need to combine these two, so I can get the results with one query, and use GROUP BY and WHERE clause from DimB. I tried this:
SELECT DimB.PO,COUNT(DISTINCT FactA.Customer) As Customers, SUM(FactB.Sales) As TotalSales FROM FactA JOIN DimA ON DimA.Dept = FactA.Dept Join DimB ON DimB.Area = DimA.Area Join FactB ON FactB.Code = DimB.Code GROUP BY DimB.PO
And it failed miserably, because it is giving way too large numbers in TotalSales. I believe it is summing way too much sales, because of the many joins. Searching around I belive subqueries might be the answer, but I am overwhelmed about which part I should be subquerying. Many thanks to anyone who has time to help.
Advertisement
Answer
I’m going to guess these are the results you want. Please always state the desired results explicitly so we don’t have to guess.
;WITH SalesByCode(Code, Sales) AS ( SELECT Code, SUM(Sales) FROM dbo.FactB GROUP BY Code ) SELECT DimB.PO, Customers = COUNT(DISTINCT FactA.Customer), TotalSales = MAX(sbc.Sales) FROM dbo.FactA JOIN dbo.DimA ON DimA.Dept = FactA.Dept Join dbo.DimB ON DimB.Area = DimA.Area Join SalesByCode AS sbc ON DimB.Code = sbc.Code GROUP BY DimB.PO;
Results:
PO Customers TotalSales ---- --------- ---------- Z1 3 55 Z3 2 70 Z5 1 70