I have a table field with a value like this:
{ "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": { "alert": "345", "channel": "ios_push", "name": "Variant 1" }, "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": { "channel": "ios_push", "name": "Control Group", "type": "control" } }
I want to know if there is some way to get the “0fc8a2a1-e334-43b8-9311-ce46da9cd32c” and “4344d89b-7f0d-4453-b2c5-d0d4a39d7d25” values.
Advertisement
Answer
Redshift is not good with JSON, and especially not good with arbitrary JSON keys (as @GMB mentioned). Also it’s not good with nested data structures.
So actually, you have 2 problems:
- extacting the json keys. I see 2 options here:
- Use python UDF
- Use regex
- unnesting an array of keys into a table. There’s a trick to unnest data into rows (see
CROSS JOIN
withseq
table in queries belowbelow) – described in this SO answer.
1. Solution with python UDF
you can implement json parsing in python and register it as a user defined function https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
function:
create or replace function f_py_json_keys (a varchar(65535)) returns varchar(65535) stable as $$ import json return ",".join(json.loads(a).keys()) $$ language plpythonu;
query:
with input(json) as ( select '{ "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": { "alert": "345", "channel": "ios_push", "name": "Variant 1" }, "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": { "channel": "ios_push", "name": "Control Group", "type": "control" } }'::varchar ), seq(idx) as ( select 1 UNION ALL select 2 UNION ALL select 3 UNION ALL select 4 UNION ALL select 5 ), input_with_occurences as ( select f_py_json_keys(json) as keys, regexp_count(keys, ',') + 1 as number_of_occurrences from input ) select split_part(keys, ',', idx) as id from input_with_occurences cross join seq where idx <= number_of_occurrences
2. Solution with REGEX magic
Redshift has some regex functions. That’s a working example that would do the job for the payload you specified:
with input(json) as ( select '{ "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": { "alert": "345", "channel": "ios_push", "name": "Variant 1" }, "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": { "channel": "ios_push", "name": "Control Group", "type": "control" } }'::varchar ), seq(idx) as ( select 1 UNION ALL select 2 UNION ALL select 3 UNION ALL select 4 UNION ALL select 5 ), input_with_occurences as ( select *, regexp_count(json, '\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}') as number_of_occurrences from input ) select REGEXP_SUBSTR(json, '\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}', 1, idx, 'e') as id from input_with_occurences cross join seq where idx <= number_of_occurrences
result looks like:
+------------------------------------+ |id | +------------------------------------+ |0fc8a2a1-e334-43b8-9311-ce46da9cd32c| |4344d89b-7f0d-4453-b2c5-d0d4a39d7d25| +------------------------------------+