
Translate Oracle query into pandas dataframe handling

I have the following dataframe:

PARAM1  PARAM2  VALUE
A       X       TUE, WED
A       Y       NO
B       X       MON, WED
B       Y       YES

I would like a pythonic way of obtaining the distinct values of PARAM1 that satisfy EITHER of these conditions:

  1. The VALUE of their PARAM2 = ‘X’ row contains the string ‘MON’.
  2. The VALUE of their PARAM2 = ‘Y’ row is equal to ‘YES’.

In the example above, the output would be just B, because:

PARAM1  PARAM2  VALUE     EXPLANATION
A       X       TUE, WED  X parameter does not contain ‘MON’, so it does not count for A.
A       Y       NO        Y parameter is not equal to ‘YES’, so it does not count for A.
B       X       MON, WED  X parameter contains ‘MON’, so it counts for B.
B       Y       YES       Y parameter is equal to ‘YES’, so it counts for B.

Since A meets neither of the criteria for PARAM2 X and Y, it is not in the output. B meets both (one would have been enough), so it is in the output.

In Oracle I would do it this way, but I am not sure how to proceed in Python:


Answer

First, we form a boolean mask based on the condition, then select the corresponding rows from the dataframe:
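One way the mask-then-select idea could look, as a minimal sketch. It assumes the dataframe is named df and is built from the sample table in the question; the variable names (is_x_with_mon, is_y_yes, mask, result) are illustrative only.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed layout).
df = pd.DataFrame(
    {
        "PARAM1": ["A", "A", "B", "B"],
        "PARAM2": ["X", "Y", "X", "Y"],
        "VALUE": ["TUE, WED", "NO", "MON, WED", "YES"],
    }
)

# Condition 1: PARAM2 is 'X' and VALUE contains the substring 'MON'.
# regex=False treats 'MON' as a literal; na=False makes missing values count as no match.
is_x_with_mon = df["PARAM2"].eq("X") & df["VALUE"].str.contains(
    "MON", regex=False, na=False
)

# Condition 2: PARAM2 is 'Y' and VALUE is exactly 'YES'.
is_y_yes = df["PARAM2"].eq("Y") & df["VALUE"].eq("YES")

# A row qualifies if it meets either condition.
mask = is_x_with_mon | is_y_yes

# Select the matching rows, then keep the distinct PARAM1 values.
result = df.loc[mask, "PARAM1"].unique().tolist()
print(result)  # ['B']
```

Because the two conditions are combined with |, a PARAM1 value appears in the result as soon as any one of its rows matches; unique() then collapses repeats such as B matching on both its X and Y rows.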

Prints:

User contributions licensed under: CC BY-SA