How to execute custom logic over a PySpark window partition

Question

I have a dataframe in the format shown below, with multiple entries per DEPNAME. My requirement is to set result = Y at the DEPNAME level if either flag_1 or flag_2 is Y. If both flags, i.e. flag_1 and flag_2, are N, the result should be set to N, as shown for DEPNAME=personnel.

Accepted Answer

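This answers the original version of the question. This looks like a case expression:

    select t.*,
           (case when flag_1 = 'Y' or flag_2 = 'Y'
                 then 'Y' else 'N'
            end) as result
    from t

For the updated version, take the maximum of the same expression over each depname partition. Because 'Y' sorts after 'N', the maximum is 'Y' whenever at least one row in the partition has a flag set to 'Y':

    select t.*,
           max(case when flag_1 = 'Y' or flag_2 = 'Y'
                    then 'Y' else 'N'
               end) over (partition by depname) as result
    from t

Here t is a placeholder for the table or temporary view that holds the data.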

Answer