BigQuery SQL: do SELECT aliases have priority over FROM clause column names when used in GROUP BY clause?

I have a simple query that has the ambiguity regarding col:

SELECT SUBSTR(col, 1, 4) AS col, 
       COUNT(*) AS nc
  FROM (SELECT 'test1' AS col
         UNION ALL
        SELECT 'test2' AS col) 
 GROUP BY col

​x
 
SELECT SUBSTR(col, 1, 4) AS col,        COUNT(*) AS nc  FROM (SELECT 'test1' AS col         UNION ALL        SELECT 'test2' AS col)  GROUP BY col​

According to the documentation, the query validator should raise an error:

Ambiguity between a FROM clause column name and a SELECT list alias in GROUP BY:
SELECT UPPER(LastName) AS LastName
FROM Singers
GROUP BY LastName;
SELECT UPPER(LastName) AS LastName
FROM Singers
GROUP BY LastName;
The query above is ambiguous and will produce an error because LastName in the GROUP BY clause could refer to the original column LastName in Singers, or it could refer to the alias AS LastName, whose value is UPPER(LastName).

But the query executes successfully and provides the following result:

╔═══╦═══════╦══════╗
║   ║ col   ║ nc   ║
╠═══╬═══════╬══════╣
║ 1 ║ test  ║ 2    ║
╚═══╩═══════╩══════╝

 
╔═══╦═══════╦══════╗║   ║ col   ║ nc   ║╠═══╬═══════╬══════╣║ 1 ║ test  ║ 2    ║╚═══╩═══════╩══════╝​

It’s clear from the results that GROUP BY clause recognizes col as an alias from the outer select (whole SUBSTR expression).

So here is my question, is it reliable enough behavior? Can I write my queries taking this into account and adapting my queries correspondingly?

Answer

That query is not really ambiguous. I understand the intent well, and BigQuery too.

Compare:

WITH Singers AS (SELECT "Sinatra" LastName UNION ALL SELECT "sinatra")

SELECT UPPER(LastName) AS LastName
FROM Singers
GROUP BY Singers.LastName;

 
WITH Singers AS (SELECT "Sinatra" LastName UNION ALL SELECT "sinatra")​SELECT UPPER(LastName) AS LastNameFROM SingersGROUP BY Singers.LastName;​

with

WITH Singers AS (SELECT "Sinatra" LastName UNION ALL SELECT "sinatra")

SELECT UPPER(LastName) AS LastName
FROM Singers
GROUP BY LastName;

 
WITH Singers AS (SELECT "Sinatra" LastName UNION ALL SELECT "sinatra")​SELECT UPPER(LastName) AS LastNameFROM SingersGROUP BY LastName;​

Both behave as I expected. If anything, the documentation is wrong here, and I’ve filled a ticket so it gets fixed.

Advertisement

Answer