Assuming I have rows in Athena and a column in each row may be empty, or contain json with key value pairs, I am trying to select the key value pairs as rows using UNNEST
, but enable me to select where value
or variable
is null in the following example.
The problem I’m facing is that the key of a map cannot be null. I was wondering if I could perhaps somehow convert the map(varchar, varchar)
to a tuple [varchar, varchar]
so that it’s not a key.
Any suggestions would be much appreciated.
Notes
- I ideally need to select * rather than listing them all manually – which perhaps is my issue? maybe I could do a case statement if i was willing to list all of the columns manually?
- The coalesce is so that the cross join still includes rows which don’t have a
ctx
column (left join unnest doesn’t exist in athena, only presto >=319) - I need to be able to write
where variable is not null
style queries on the result set because of the visualisation software which will consume this
Data
event_id | ctx |
---|---|
1 | {“hello”: “world”, “bar”: “baz”} |
2 |
Expected Result
event_id | variable | value |
---|---|---|
1 | hello | world |
1 | bar | baz |
2 | null | null |
Query
SELECT * from events CROSS JOIN UNNEST( COALESCE( TRY_CAST( JSON_PARSE(ctx) as MAP(varchar, varchar) ), map(array[''], array[null]) ) ) AS t (variable, value)
Advertisement
Answer
You can unnest from arrays also, so transforming map into two arrays (of keys and of values) and coalescing each one with array of single null value should do the trick:
WITH events (event_id, ctx) AS ( VALUES (1, '{"hello": "world", "bar": "baz"}'), (2, null) ) SELECT event_id, t.* FROM ( SELECT event_id, TRY_CAST(JSON_PARSE(ctx) as MAP(varchar, varchar)) as m FROM events ) CROSS JOIN UNNEST( coalesce(map_keys(m), array [ null ]), coalesce(map_values(m), array [ null ]) ) as t (variable, value)
Output:
event_id | variable | value |
---|---|---|
1 | bar | baz |
1 | hello | world |
2 |