I have 3 tables I’m trying to join. Let’s call them TableA, TableB, and TableC:
DECLARE @TableA TABLE ( Key1 int, PRIMARY KEY ( Key1 ) ) DECLARE @TableB TABLE ( Key1 int, Key2 int, PRIMARY KEY ( Key1, Key2 ) ) DECLARE @TableC TABLE ( Key3 int NOT NULL, Key1 int NOT NULL, Key2 int NULL, PRIMARY KEY ( Key3 ) )
Here’s some sample data:
INSERT INTO @TableA (Key1) VALUES (1); INSERT INTO @TableB (Key1, Key2) VALUES (1, 1), (1, 2), (1, 3), (1, 4) INSERT INTO @TableC (Key3, Key1, Key2) VALUES (1, 1, NULL), (2, 1, NULL), (3, 1, 1), (4, 1, 3)
TableB and TableC both have foreign keys to TableA via Key1. In practice, TableC can also refer to TableB via Key1 and Key2 combined, if Key2 is not null, but there’s no actual foreign key. Key3 is irrelevant, other than the fact that Key1 and Key2 are not part of the primary key of TableC.
I’m trying to write a query which combines TableB and TableC:
SELECT TableA.Key1 AS [A1], TableB.Key1 AS [B1], TableB.Key2 AS [B2], TableC.Key1 AS [C1], TableC.Key2 AS [C2], TableC.Key3 AS [C3] FROM @TableA AS TableA FULL OUTER JOIN @TableC AS TableC ON TableC.Key1 = TableA.Key1 FULL OUTER JOIN @TableB AS TableB ON (TableB.Key1 = TableA.Key1 AND TableC.Key1 IS NULL) OR (TableC.Key1 = TableB.Key1 AND TableC.Key2 = TableB.Key2) WHERE (TableA.Key1 = TableB.Key1 OR TableA.Key1 = TableC.Key1) ORDER BY TableB.Key2, TableC.Key2
My expectation is that TableB and TableC should both include all their rows, matching up the ones that match on both keys, and NULLS where they don’t match.
I expect to get this:
A1 B1 B2 C1 C2 C3 1 NULL NULL 1 NULL 1 1 NULL NULL 1 NULL 2 1 1 1 1 1 3 1 1 2 NULL NULL NULL -- THIS ROW IS MISSING 1 1 3 1 3 4 1 1 4 NULL NULL NULL -- THIS ROW IS MISSING
But instead I get this:
A1 B1 B2 C1 C2 C3 1 NULL NULL 1 NULL 1 1 NULL NULL 1 NULL 2 1 1 1 1 1 3 1 1 3 1 3 4
If I comment out the WHERE clause, I get all the rows I expect, except A1 is NULL for the missing rows:
A1 B1 B2 C1 C2 C3 1 NULL NULL 1 NULL 1 1 NULL NULL 1 NULL 2 1 1 1 1 1 3 NULL 1 2 NULL NULL NULL -- A1 should be 1 1 1 3 1 3 4 NULL 1 4 NULL NULL NULL -- A1 should be 1
Why is TableA.Key1 coming back NULL and causing it to exclude rows where TableB.Key2 is missing?
EDIT:
Here’s the final fixed query after I learned what I was doing wrong:
SELECT TableA.Key1 AS A1, Subquery.* FROM @TableA AS TableA INNER JOIN ( SELECT TableB.Key1 AS [B1], TableB.Key2 AS [B2], TableC.Key1 AS [C1], TableC.Key2 AS [C2], TableC.Key3 AS [C3] FROM @TableC AS TableC FULL OUTER JOIN @TableB AS TableB ON TableB.Key1 = TableC.Key1 AND TableB.Key2 = TableC.Key2 ) AS Subquery ON Subquery.B1 = TableA.Key1 OR Subquery.C1 = TableA.Key1 ORDER BY Subquery.B2, Subquery.C2
Advertisement
Answer
Why is TableA.Key1 coming back NULL and causing it to exclude rows where TableB.Key2 is missing?
A full outer join is the same as an INNER JOIN
but any unmatched rows from either side are added back in with NULL
for the columns from the other side.
Your query does full outer join of A
and C
first so start by looking at the result of that.
SELECT TableA.Key1 AS [A1], TableC.Key1 AS [C1], TableC.Key2 AS [C2], TableC.Key3 AS [C3] FROM @TableA AS TableA FULL OUTER JOIN @TableC AS TableC ON TableC.Key1 = TableA.Key1
This returns the following virtual table (VT1) going on to the next stage. As this is the same result as INNER JOIN
I doubt it needs any explanation. Every row in @TableC
successfully matched the single row in @TableA
.
+----+----+------+----+ | A1 | C1 | C2 | C3 | +----+----+------+----+ | 1 | 1 | NULL | 1 | | 1 | 1 | NULL | 2 | | 1 | 1 | 1 | 3 | | 1 | 1 | 3 | 4 | +----+----+------+----+
This is then full outer joined onto B
. The contents of B
are
+------+------+ | Key1 | Key2 | +------+------+ | 1 | 1 | | 1 | 2 | | 1 | 3 | | 1 | 4 | +------+------+
The INNER JOIN
of those two result sets with predicate ON (TableB.Key1 = [A1] AND [C1] IS NULL) OR ([C1] = TableB.Key1 AND [C2] = TableB.Key2)
returns only 2 rows.
+----+----+----+----+----+----+ | A1 | B1 | B2 | C1 | C2 | C3 | +----+----+----+----+----+----+ | 1 | 1 | 1 | 1 | 1 | 3 | | 1 | 1 | 3 | 1 | 3 | 4 | +----+----+----+----+----+----+
Unmatched rows from VT1
are added back in as per LEFT JOIN
(these are the ones where C3
is 1
or 2
)
+----+------+------+----+------+----+ | A1 | B1 | B2 | C1 | C2 | C3 | +----+------+------+----+------+----+ | 1 | NULL | NULL | 1 | NULL | 1 | | 1 | NULL | NULL | 1 | NULL | 2 | | 1 | 1 | 1 | 1 | 1 | 3 | | 1 | 1 | 3 | 1 | 3 | 4 | +----+------+------+----+------+----+
and unmatched rows from B
as per RIGHT JOIN
(these were the ones where B2
is 2
or 4
)
giving you the final result
+------+------+------+------+------+------+ | A1 | B1 | B2 | C1 | C2 | C3 | +------+------+------+------+------+------+ | 1 | NULL | NULL | 1 | NULL | 1 | | 1 | NULL | NULL | 1 | NULL | 2 | | 1 | 1 | 1 | 1 | 1 | 3 | | 1 | 1 | 3 | 1 | 3 | 4 | | NULL | 1 | 2 | NULL | NULL | NULL | | NULL | 1 | 4 | NULL | NULL | NULL | +------+------+------+------+------+------+