Skip to content
Advertisement

HIVE Query: How to select the ‘join key’ for “FULL OUTER JOIN”?

I got the following error:

Multiple attributes named business_id in reportAttributeReferenceAmbiguous; p.business_id, s.business_id

from the SPARK/HIVE query:

SELECT
    business_id,
    p.p1,
    p.p2,
    s.s1,
    s.s2
FROM p_table p
FULL OUTER JOIN s_table s
    ON p.business_id = s.business_id

Since I am using FULL OUTER JOIN, business_id could be either p.business_id or s.business_id, so I didn’t specify which one to use. (I thought some rows might have only p.business_id while some other rows might have only s.business_id, do I understand this wrong?)

So I am wondering what’s the proper way to select business_id in the above query? Thanks!

Advertisement

Answer

Use

coalesce(p.business_id, s.business_id) as business_id

Or

nvl(p.business_id, s.business_id) as business_id

Or CASE statements like in this answer: https://stackoverflow.com/a/37744071/2700344

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement