Why does outer reference in SQL subquery produce different results?

I run two SQL queries: The first one have an outer reference to the table inside subquery. In the second one I add the same table inside subquery. The results are different, it fails due to multiple rows.

The first one runs on Oracle, but fails on Spark-SQL. Therefore I am looking for a solution similar to Oracle SQl as in the first SQL code.

Query 1:

select *, 
(select N_CODE 
from table2 f 
where f.ID1 = (select min(f.ID1) 
               from table1 a left join table2 f on a.ID2 = f.ID2
               where a.ID2 = table1.ID2 
               ) 
) AS CODE

from table1

​x
 
select *, (select N_CODE from table2 f where f.ID1 = (select min(f.ID1)                from table1 a left join table2 f on a.ID2 = f.ID2               where a.ID2 = table1.ID2                ) ) AS CODE​from table1​

Query 2:

select *, 
(select N_CODE 
from table1 t, table2 f 
where f.ID1 = (select min(f.ID1) 
               from table1 a left join table2 f on a.ID2 = f.ID2
               where a.ID2 = t.ID2 
               ) 
) AS CODE

from table1

 
select *, (select N_CODE from table1 t, table2 f where f.ID1 = (select min(f.ID1)                from table1 a left join table2 f on a.ID2 = f.ID2               where a.ID2 = t.ID2                ) ) AS CODE​from table1​

The second one is my solution to the first one in Spark SQL, but it fails on both Oracle and Spark. How can I run the first query on Spark SQL similar to Oracle?

Please do not modify the structure of the query.

Answer

Oracle supports multiple inner queries but spark does not. The best way to overcome it is to split your super query into pieces and use join them.

For instance run this part and save it as a table3:

select min(table2 .ID1)
           from table1 a left join table2 f on a.ID2 = f.ID2
           where a.ID2 = t.ID2
from table2

 
select min(table2 .ID1)           from table1 a left join table2 f on a.ID2 = f.ID2           where a.ID2 = t.ID2from table2​

Then use it for your main query:

....
where f.ID1 = table3

 
....where f.ID1 = table3​

Advertisement

Answer