For the following sql
CREATE or replace TABLE temp.t1 ( a STRING) ; insert into temp.t1 values ('val_a'); CREATE or replace TABLE temp.t2 (b STRING) ; insert into temp.t2 values ('val_b'); create or replace table `temp.a1` as select distinct b from temp.t2 ; select distinct a from `temp.t1` where a in (select distinct a from `temp.a1`) ;
Since there is no a
in temp.a1
and there should be an error here, However, the output of Bigquery is
Row a 1 val_a
Why the result happened?
On the other side, when run select distinct a from temp.a1;
there is one error Unrecognized name: a
comes up.
Advertisement
Answer
Your query is:
select distinct a from `temp.t1` where a in (select distinct a from `temp.a1`);
You think this should be:
select distinct t1.a from `temp.t1` t1 where t1.a in (select distinct a1.a from `temp.a1` a1);
And hence generate an error. However, the rules of SQL interpret this as:
select distinct t1.a from `temp.t1` t1 where t1.a in (select distinct t1.a from `temp.a1` a1);
Because the scoping rules say that if a
is not found in the subquery then look for it in the outer query.
That is the definition of SQL.
The solution? Always qualify column references. Qualify means to include the table alias in the reference.
Also note that select distinct
is meaningless in the subquery for an in
, because in
does not create duplicates. You should get rid of the distinct
in the subquery.