Tag: pyspark

PySpark: cast array with nested struct to string

I have a PySpark DataFrame with a column named Filters of type "array>". I want to save the DataFrame to a CSV file, and for that I need to cast the array to string type. I tried DF.Filters.tostring() and DF.Filters.cast(StringType()), but both solutions produce an unreadable internal representation for each row in the Filters column instead of the data: org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@56234c19. The code is as follows. Sample JSON data:

Spark SQL filtering (selecting with where clause) with multiple conditions

Hi, I have the following issue: all the values that I want to filter on are literal "null" strings, not N/A or actual null values. I tried these three options:

numeric_filtered = numeric.filter(numeric['LOW'] != 'null').filter(numeric['HIGH'] != 'null').filter(numeric['NORMAL'] != 'null')

numeric_filtered = numeric.filter(numeric['LOW'] != 'null' AND numeric['HIGH'] != 'null' AND numeric['NORMAL'] != 'null')

sqlContext.sql("SELECT * from numeric WHERE LOW != 'null'