I have a table with a JSON string.
When running this Spark SQL query:
select from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:int>>')
I get a NULL, since the data types for some_number do not match (int vs double)…
If I run this, it works:
select from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>')
Is there a way to CAST this on-the-fly?
Answer
You can do from_json first using array<struct<column_1:string,some_number:double>>, then cast the result as array<struct<column_1:string,some_number:int>>.
Example:
spark.sql("""select cast(from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>') as array<struct<column_1:string,some_number:int>>)""").show()
//+-------------------------------------------------------+
//|jsontostructs([{"column_1":"hola", "some_number":1.0}])|
//+-------------------------------------------------------+
//|                                             [[hola, 1]]|
//+-------------------------------------------------------+

//printSchema
spark.sql("""select cast(from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>') as array<struct<column_1:string,some_number:int>>)""").printSchema()
//root
// |-- jsontostructs([{"column_1":"hola", "some_number":1.0}]): array (nullable = true)
// |    |-- element: struct (containsNull = true)
// |    |    |-- column_1: string (nullable = true)
// |    |    |-- some_number: integer (nullable = true)
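The same parse-then-cast approach also works through the DataFrame API, which may be more convenient when the JSON lives in a table column rather than a literal. Below is a minimal sketch for a spark-shell session; the df, json_str and parsed names are illustrative, not from the question. It parses with the double schema via org.apache.spark.sql.functions.from_json and then casts the resulting column to the int struct type.

import org.apache.spark.sql.functions.{col, from_json}
import spark.implicits._

// Hypothetical sample data: a single-column DataFrame holding the JSON string
val df = Seq("""[{"column_1":"hola", "some_number":1.0}]""").toDF("json_str")

// Parse with the schema that matches the data (double), then cast to the target schema (int)
val parsed = df.select(
  from_json(col("json_str"), "array<struct<column_1:string,some_number:double>>", Map.empty[String, String])
    .cast("array<struct<column_1:string,some_number:int>>")
    .as("parsed")
)

parsed.show(false)
parsed.printSchema()

Either way, the key point is the same: parse with a schema that matches the JSON, then cast to the schema you actually want.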