Skip to content
Advertisement

How to cast from double to int in from_json Spark SQL (NULL output)

I have a table with a JSON string

When running this Spark SQL query:

select from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:int>>')

I get a NULL, since the data types for some_number are not matching (int vs double)…

If I run this it works:

select from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>')

Is there a way to CAST this on-the-fly?

Advertisement

Answer

You can do from_json first using array<struct<column_1:string,some_number:double>> then cast as array<struct<column_1:string,some_number:int>>

Example:

spark.sql("""select cast(from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>') as array<struct<column_1:string,some_number:int>>)""").show()

//+-------------------------------------------------------+
//|jsontostructs([{"column_1":"hola", "some_number":1.0}])|
//+-------------------------------------------------------+
//|                                            [[hola, 1]]|
//+-------------------------------------------------------+

//printSchema
spark.sql("""select cast(from_json('[{"column_1":"hola", "some_number":1.0}]', 'array<struct<column_1:string,some_number:double>>') as array<struct<column_1:string,some_number:int>>)""").printSchema()

//root
// |-- jsontostructs([{"column_1":"hola", "some_number":1.0}]): array (nullable = true)
// |    |-- element: struct (containsNull = true)
// |    |    |-- column_1: string (nullable = true)
// |    |    |-- some_number: integer (nullable = true)
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement