Hive DBMS; Two tables — A and B
Table A
x
prnt_id sub_id ac_nm cost units
unknown abc01 abc corp 34500 24
unknown unknown xyz corp 9800 10
856 abc03 jfk corp 9820 12
Table B
prnt_id sub_id ac_nm
123 abc01 abc corp
456 abc02 xyz corp
856 abc03 jfk corp
859 abc04 ops corp
Question –> Trying to execute a query where: Join table A with table B, first on prnt_id, if it’s “unknown”, then join on sub_id, if that is “unknown”, join on ac_nm
Desired Output:
prnt_id sub_id ac_nm cost units
123 abc01 abc corp 34500 24
456 abc02 xyz corp 9800 10
856 abc03 jfk corp 9820 12
Advertisement
Answer
You must use LEFT
joins of TableB
to 3 copies of TableA
and filter out the non matching rows:
select b.*,
coalesce(a1.cost, a2.cost, a3.cost) cost,
coalesce(a1.units, a2.units, a3.units) units
from TableB b
left join TableA a1 on a1.prnt_id = b.prnt_id
left join TableA a2 on a2.sub_id = b.sub_id and a1.prnt_id is null
left join TableA a3 on a3.ac_nm = b.ac_nm and a2.sub_id is null
where coalesce(a1.prnt_id, a2.sub_id, a3.ac_nm) is not null
order by b.prnt_id
See the demo.
Results:
| prnt_id | sub_id | ac_nm | cost | units |
| ------- | ------ | -------- | ----- | ----- |
| 123 | abc01 | abc corp | 34500 | 24 |
| 456 | abc02 | xyz corp | 9800 | 10 |
| 856 | abc03 | jfk corp | 9820 | 12 |