I’m trying to filter rows out of a Hive table df:
project  schedule  timestamp
p1       1         t1
p1       2         t1
p1       3         t1
p2       1         t2
p2       2         t2
I want to overwrite the table so that the resulting dataset looks like this:
project  schedule  timestamp
p1       2         t1
p1       3         t1
p2       1         t2
p2       2         t2
The query I was trying to use was:
Insert overwrite table df Select * from df where project != p1 and schedule != 1.
This does not work, because it filters out all rows of project p1 rather than only the one with schedule 1. I think I’m missing something trivial here.
Answer
I think the logic you want is:
where project <> 'p1' or schedule <> 1
Or equivalently:
where not (project = 'p1' and schedule = 1)
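The two forms are equivalent by De Morgan's law: negating "A and B" gives "not A or not B". As a rough way to check this against the sample data, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption made only so it runs locally; the `select_rows` helper is hypothetical and the WHERE clauses are the ones discussed above).

```python
import sqlite3

# SQLite stands in for Hive here (assumption, chosen so the sketch runs locally).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (project TEXT, schedule INTEGER, timestamp TEXT)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?, ?)",
    [
        ("p1", 1, "t1"),
        ("p1", 2, "t1"),
        ("p1", 3, "t1"),
        ("p2", 1, "t2"),
        ("p2", 2, "t2"),
    ],
)


def select_rows(where):
    """Hypothetical helper: return the rows of df that satisfy the given WHERE clause."""
    query = (
        "SELECT project, schedule, timestamp FROM df "
        f"WHERE {where} ORDER BY project, schedule"
    )
    return conn.execute(query).fetchall()


# The condition from the question: keeps only rows that differ on BOTH columns.
original = select_rows("project != 'p1' AND schedule != 1")

# The two conditions from the answer: drop only the row (p1, 1).
with_or = select_rows("project <> 'p1' OR schedule <> 1")
with_not = select_rows("NOT (project = 'p1' AND schedule = 1)")

print(original)
print(with_or)
print(with_or == with_not)
```

On this sample, the AND version keeps only the (p2, 2) row, while both corrected versions keep the four rows from the desired output. In Hive, the same WHERE clause would go into the INSERT OVERWRITE ... SELECT statement from the question.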