
How can I select a column based on a specific value in another column?

I have a PySpark DataFrame. How can I select one column based on a specific value in another column? Suppose I have n columns; for two of them I have:

A.  B.
a   b 
a   c
d   f

I want all of column B where column A is a, so:

A.  B.
a   b 
a   c
 


Answer

It’s just a simple filter:

df2 = df.filter("A = 'a'")

which comes in many flavours, such as

df2 = df.filter(df.A == 'a')
df2 = df.filter(df['A'] == 'a')

or

import pyspark.sql.functions as F
df2 = df.filter(F.col('A') == 'a')