Skip to content
Advertisement

Is it right situation to do left outer join?

I have two datatset, A and B.(A is a big dataset like personal info and B is like a small summarized demographic statistic data) I want to append summary info to A data with demographic info.

As one single row of B can matches with multiple row of A, is it right to send a hive ql like A left outer join B?

For example, A is

and B is

Advertisement

Answer

From what you describe a join is appropriate. If all rows in a have a matching row in b, then an inner join is sufficient:

You would use left join when a has rows that do not match in b and you want them in the result set.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement