Skip to content
Advertisement

Bigquery Full Join ON multiple conditions

I want to perform a Full Outer join on two tables with multiple conditions to result all the matching records along with unmatched records from both tables. Tbl1 is a bigger table with 21 M records and Tbl2 has 5k rows, like the example query below. But the outer join cannot be performed with OR conditions because of the error ‘FULL OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join’. Is writing separate queries, then use COALESCE the only solution in this case? I am not sure how to implement this solution. Looking for any help to get this correct.

SELECT  
* 
FROM 
`myproject.table1` as t1
Full Outer JOIN
`myproject.table2` as t2
ON
(
t1.Camp1ID = t2.ID
OR t1.Camp2ID = t2.ID
OR t1.Camp3ID = t2.ID
OR t1.Camp4ID = t2.ID
OR t1.Camp5ID h = t2.ID
OR t1.Camp6ID = t2.ID
OR t1.Camp7ID = t2.ID
OR t1.Camp8ID = t2.ID
OR t1.Camp9ID = t2.ID
OR t1.Camp10ID = t2.ID
OR t1.Camp11ID = t2.ID
OR t1.Camp12ID = t2.ID
OR t1.Camp13ID = t2.ID
OR t1.Camp14ID = t2.ID
OR t1.Camp15ID = t2.ID
OR t1.Camp16ID = t2.ID
)
Where
t1.Date BETWEEN  PARSE_DATE('%m/%d/%y', t2.StartDate) AND  PARSE_DATE('%m/%d/%y', t2.EndDate)

Example code –

Tbl1:

EmpNo   EmpITPrj    EmpFinPrj   EmpHRPrj    EmpIntPrj           Date
1           IT101       null            null         null           2019-09-01
2            null        Fin101         null         null            2001-06-05
3            null        null           HR101    null           2005-11-25
4           null        null            null         Int501     2010-10-15
5           null        null            null         Int105     2019-01-10

Tbl2:

PrjID   PrjStartDate    PrjEndDate
Fin101  06/01/2005      08/14/2005
IT102   07/11/2006      10/30/2006
Int105   09/15/2019      10/01/2019

EmpNo   EmpITPrj    EmpFinPrj   EmpHRPrj    EmpIntPrj   Date            PrjID    PrjStartDate           PrjEndDate
1           IT101       null            null            null         2019-09-01     null            null                null        
2            null        Fin101         null            null         2001-06-05     Fin101          06/01/2005          08/14/2005
3           null        null            HR101        null        2005-11-25     null            null                null
4           null        null            null            Int501   2010-10-15     null            null                null
5           null        null            null            Int105   2019-01-10     Int105          09/15/2019          10/01/2019
null        null        null        null        null         null           IT102       07/11/2006          10/30/2006

Advertisement

Answer

Just move your conditions from ON to WHERE and optionally optimize all those ORs as in below

t2.ID IN (t1.Camp1ID,t1.Camp2ID,t1.Camp3ID,t1.Camp4ID,t1.Camp5ID,t1.Camp6ID,t1.Camp7ID,t1.Camp8ID,t1.Camp9ID,t1.Camp10ID,t1.Camp11ID,t1.Camp12ID,t1.Camp13ID,t1.Camp14ID,t1.Camp15ID,t1.Camp16ID)  

so you final query can look like below

SELECT * 
FROM `myproject.table1` as t1
Full Outer JOIN `myproject.table2` as t2
ON TRUE
Where t1.Date BETWEEN  PARSE_DATE('%m/%d/%y', t2.StartDate) AND  PARSE_DATE('%m/%d/%y', t2.EndDate)
And t2.ID IN (t1.Camp1ID,t1.Camp2ID,t1.Camp3ID,t1.Camp4ID,t1.Camp5ID,t1.Camp6ID,t1.Camp7ID,t1.Camp8ID,t1.Camp9ID,t1.Camp10ID,t1.Camp11ID,t1.Camp12ID,t1.Camp13ID,t1.Camp14ID,t1.Camp15ID,t1.Camp16ID) 
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement