I have a PySpark data frame. How can I select one column where another column contains a specific value? Suppose I have n columns; for two of them the data is:

A  B
a  b
a  c
d  f

I want all of column B where column A is 'a', so:

A  B
a  b
a  c
Answer
It’s just a simple filter:
df2 = df.filter("A = 'a'")
which comes in many flavours, such as
df2 = df.filter(df.A == 'a')
df2 = df.filter(df['A'] == 'a')
or
import pyspark.sql.functions as F
df2 = df.filter(F.col('A') == 'a')