Skip to content
Advertisement

LEFT JOIN ON NULL Key Values Combined with GROUP BY

I’m using Teradata SQL and I wrote the following query (pay attention at the LEFT JOIN)

SELECT
    key1,
    secondvalue,
    count(DISTINCT firstvalue)
FROM
(
    SELECT
        t1.val AS firstvalue,
        t1.key1,
        t2.val AS secondvalue
        
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.key1 = t1.key1 AND t1.key2 = t2.key2 AND t1.key3 = t2.key3
) AS Testcase
GROUP BY 1, 2

Some entries for t1.key2 und t1.key3 (of the left sided table) are NULL. When that’s the case, the rows are not showing in the result, why? Is that Teradata specific, I would expect a LEFT JOIN to show rows with NULL values.

If I for example choose to reduce the LEFT JOIN statement to

SELECT
    key1,
    secondvalue,
    count(DISTINCT firstvalue)
FROM
(
    SELECT
        t1.val AS firstvalue,
        t1.key1,
        t2.val AS secondvalue
        
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.key1 = t1.key1
) AS Testcase
GROUP BY 1, 2

More distinct values for key1 are showing up, which shouldn’t be the case, right? I’d want to see all distinct values for key1, even when then key2 and key3 are NULL. If not, I want to see the specific value from the second table.

I really don’t want to map the NULL values to another value first.

Advertisement

Answer

Some entries for t1.key2 und t1.key3 (of the left sided table) are NULL. When that’s the case, the rows are not showing in the result, why?

Likely because group by and distinct will group all nulls together. It is true that in SQL, null does not necessarily equal null, but I believe that from a distinct and group by perspective, nulls are considered identical.

I would expect a LEFT JOIN to show rows with NULL values.

I’ve never tried a left outer join where null is on the left side. I can’t say whether the join would emit a result for such a row, but I can tell you that it wouldn’t match an equi-join because null does not equal null. Are you looking for a full outer join?

LEFT JOIN table2 t2 ON t1.key1 = t1.key1

This predicate will match every row where t1.key1 is not null, which is almost a Cartesian product. That is why your count is so much larger.

I think you want

LEFT JOIN table2 t2 ON t1.key1 = t2.key1

I’d want to see all distinct values for key1, even when then key2 and key3 are NULL

Overall, I think what you want is

LEFT JOIN table2 t2 ON t1.key1 = t2.key1 AND coalesce(t1.key2, -1) = coalesce(t2.key2, -1) AND coalesce(t1.key3, -1) = coalesce(t2.key3, -1)

You want the second argument to coalesce to be a syntactically valid value of whatever type key2 and key3 are, but be a value that is not valid in your usage of it (otherwise, we’d joining rows where we have -1 on one side and null on the other.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement