What is the semantic difference between the following queries? To me both seemed similar until I executed them on presto. Is this something specific to presto or am I missing something in the SQL standard?
Form 1: All conditions specified in the ON clause.
SELECT t1.colA, t1.colB, t1.colC, t2.colD FROM t1 LEFT OUTER JOIN t2 ON t1.colA = t2.colA AND t1.colB = t2.colB AND t1.colE = 1 AND t2.colF = 2;
Form 2: Some conditions specified in the WHERE clause instead.
SELECT t1.colA, t1.colB, t1.colC, t2.colD FROM t1 LEFT OUTER JOIN t2 ON t1.colA = t2.colA AND t1.colB = t2.colB WHERE t1.colE = 1 AND t2.colF = 2;
Form 1 results in some rows but form 2 doesn’t but shouldn’t they be equivalent?
Advertisement
Answer
There are two important differences.
First the condition t1.colE = 1
. A LEFT JOIN
keeps all rows in the first table, regardless of what the ON
clause evaluates to. So, t1.colE
does not change the number of rows in the result set. However, it does have the strange effect that any columns from t2
when this condition is not true are NULL
.
Second the condition t2.colF = 2
has a different effect. This turns the LEFT JOIN
into an INNER JOIN
, because NULL
values do not match the WHERE
clause.