I have a Table1
containing some blacklisted names. Now suppose I receive a record "def"
. The hive query should check if "def"
is present in Table1
or not. If not the name_status
should be set to blacklisted
otherwise null
. The name "def"
will be inserted in both cases. The problem I am facing is that in hive we cannot use subquery in from
clause only.
Table1
x
----blacklisted_names------
"abc"
"xyz"
---------------------------
Table 2 (before receiveing "def"
)
---name--|--name_status-----
|
"abc" | blacklisted
|
|
----------------------------
Table 2 (after receiveing "def"
)
---name--|--name_status-----
|
"abc" | blacklisted
|
"def" | null
----------------------------
Table 2 (after receiveing "xyz"
)
---name--|--name_status-----
|
"abc" | blacklisted
|
"def" | null
|
"xyz" | blaclisted
----------------------------
Advertisement
Answer
Join with blacklist and use CASE to assign ‘blaclisted’ for joined rows
insert into table2
select n.name,
case when b.name is not null then 'blacklisted' end as name_status
from new_rows n
left join table1 b --blacklisted
on n.name=b.name