In my table I’ve a column, which is a varchar but has a nested dictionary-like format (three nested levels). Some entries have multiple key-value pairs (customer ID & name), while some just have a single entry (customer ID). For example:
column { "customer_type1": { "location1": {"customerid":"12345","name":"John"}, "location2": {"customerid":"12346","name":"Conor"}, } "customer_type2": { "location3": {"customerid":"12347","name":"Brian"}, "location4": {"customerid":"12348"}, } }
I need a query that will break out the the column into a table like this:
customer_type Location Customer_ID Name customer_type1 location1 12345 John customer_type1 location2 12346 Conor customer_type2 location3 12347 Brian customer_type2 location4 12348
I know a solution to extract for a single nested key-value pair, but cant edit it to work for a nested dictionary like this. I’m using Prestosql.
--query for single nested key-value pair select json_extract_scalar(json_parse(column), '$.customer_id') customerid, json_extract_scalar(json_parse(column), '$.name') name from dataset
Advertisement
Answer
Common approach with this kind of dynamic json is to cast it to MAP
, in this case nested map – MAP(VARCHAR, MAP(VARCHAR, JSON))
and use unnest
to flatten the result:
-- sample data WITH dataset (json_str) AS ( VALUES ( '{ "customer_type1": { "location1": {"customerid":"12345","name":"John"}, "location2": {"customerid":"12346","name":"Conor"} }, "customer_type2": { "location3": {"customerid":"12347","name":"Brian"}, "location4": {"customerid":"12348"} } }' ) ) -- query select customer_type, location, json_extract_scalar(cust_json, '$.customerid') customer_id, json_extract_scalar(cust_json, '$.name') name from ( select cast(json_parse(json_str) as map(varchar, map(varchar, json))) as maps from dataset ) cross join unnest(maps) as t(customer_type, location_map) cross join unnest(location_map) as t(location, cust_json)
Output:
customer_type | location | customer_id | name |
---|---|---|---|
customer_type1 | location1 | 12345 | John |
customer_type1 | location2 | 12346 | Conor |
customer_type2 | location3 | 12347 | Brian |
customer_type2 | location4 | 12348 |