This is my dataframe:
 |-- authors: array (nullable = true)
 |    |-- element: string (containsNull = true)
I want to select all books where the author is Udo Haiber. I tried:
spark.sql("select * from f where authors="Udo Haiber" ").show
but of course it didn't work, because authors is an array.
Answer
You can use array_contains to check whether the author is in the array:
spark.sql("select * from f where array_contains(authors, 'Udo Haiber')")
Use single quotes around the author name, because the query string itself is delimited by double quotes. A double quote inside it would end the string literal early.
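If you'd rather avoid quoting inside a SQL string altogether, the same filter can probably be written with the DataFrame API. Here is a minimal sketch, assuming Scala and a local SparkSession. The sample rows, the title column, and the names ArrayContainsSketch and booksBy are made up for illustration; only the authors column and the view name f come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array_contains, col}

object ArrayContainsSketch {

  // Hypothetical helper: keep the rows whose `authors` array contains `author`.
  def booksBy(df: DataFrame, author: String): DataFrame =
    df.filter(array_contains(col("authors"), author))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("array-contains-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; only the `authors` column is from the question.
    val f = Seq(
      ("Book A", Seq("Udo Haiber", "Jane Doe")),
      ("Book B", Seq("John Smith"))
    ).toDF("title", "authors")

    // Register the DataFrame so the SQL version can refer to it as `f`.
    f.createOrReplaceTempView("f")

    // SQL version, as in the answer above.
    spark.sql("select * from f where array_contains(authors, 'Udo Haiber')").show()

    // DataFrame API version; both keep only the "Book A" row.
    booksBy(f, "Udo Haiber").show()

    spark.stop()
  }
}
```

Passing the author as a plain Scala string also means names that contain an apostrophe need no escaping.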