Skip to content
Advertisement

How to deal with ambiguous column reference in sql column name reference?

I have some code:

CREATE OR REPLACE TEMPORARY VIEW cctv_join_1  
AS
    SELECT * 
    FROM cctv1_details AS A 
    LFET JOIN cctv3_details AS B ON (A.CDE_WR = B.CDE_WR) AND A.CDE_dist = B.CDE_dist

I then try

CREATE OR REPLACE TEMPORARY VIEW cctv_join_2  
AS
   (SELECT * 
    FROM cctv_join_1 AS A
    LEFT JOIN cctv5_details AS B ON (A.CDE_WR = B.CDE_WR) AND (A.CDE_dist = B.CDE_dist))

but I get an error

Error in SQL statement: AnalysisException: Reference ‘A.CDE_WR’ is ambiguous, could be: A.CDE_WR, A.CDE_WR.; line 6 pos 4

in databricks. How can I deal with this?

Advertisement

Answer

This query:

SELECT * 
FROM cctv1_details c1d LEFT JOIN
     cctv3_details c3d
     ON c1d.CDE_WR = c3d.CDE_WR AND c1d.CDE_dist = c3d.CDE_dist

is using SELECT *. The * is shorthand for all columns from both tables.

Obviously the combined columns from the two tables have duplicate column names; at least, CDE_WR and CDE_dist — and there may be others. The general solution is to list all the columns out:

SELECT c1d.col1, c1d.col2, . . . c3d.colx, c3d.coly
FROM cctv1_details c1d LEFT JOIN
     cctv3_details c3d
     ON c1d.CDE_WR = c3d.CDE_WR AND c1d.CDE_dist = c3d.CDE_dist;

However, this is often shorted to:

SELECT c1d.*, c3d.colx, c3d.coly
FROM cctv1_details c1d LEFT JOIN
     cctv3_details c3d
     ON c1d.CDE_WR = c3d.CDE_WR AND c1d.CDE_dist = c3d.CDE_dist;

Note that I changed the table aliases to be reasonable abbreviations for the table names, making the query much clearer and easier to maintain.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement