Redshift get Json Keys

Question

I have a table field with a value like this: { "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": { "alert": "345", "channel": "ios_push", ...

Accepted Answer

Redshift is not good with JSON, and especially not good with arbitrary JSON keys (as @GMB mentioned). Also it&#8217;s not good with nested data structures.So actually, you have 2 problems:extacting the json keys. I see 2 options here:Use python UDFUse regexunnesting an array of keys into a table. There&#8217;s a trick to unnest data into rows (see CROSS JOIN with seq table in queries belowbelow) &#8211; described in this SO answer.1. Solution with python UDFyou can implement json parsing in python and register it as a user defined function https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.htmlfunction:create or replace function f_py_json_keys (a varchar(65535))    returns varchar(65535)    stableas $$    import json    return ",".join(json.loads(a).keys())$$ language plpythonu;query:with input(json) as (    select '{    "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {        "alert": "345",        "channel": "ios_push",        "name": "Variant 1"    },    "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {        "channel": "ios_push",        "name": "Control Group",        "type": "control"    }}'::varchar), seq(idx) as (    select 1 UNION ALL    select 2 UNION ALL    select 3 UNION ALL    select 4 UNION ALL    select 5), input_with_occurences as (    select f_py_json_keys(json) as keys,           regexp_count(keys, ',') + 1 as number_of_occurrences    from input)select    split_part(keys, ',', idx) as idfrom input_with_occurences cross join seqwhere idx <= number_of_occurrences2. Solution with REGEX magicRedshift has some regex functions. That&#8217;s a working example that would do the job for the payload you specified:with input(json) as (    select '{    "0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {        "alert": "345",        "channel": "ios_push",        "name": "Variant 1"    },    "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {        "channel": "ios_push",        "name": "Control Group",        "type": "control"    }}'::varchar), seq(idx) as (    select 1 UNION ALL    select 2 UNION ALL    select 3 UNION ALL    select 4 UNION ALL    select 5), input_with_occurences as (    select *,           regexp_count(json,                        '\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}') as number_of_occurrences    from input)select       REGEXP_SUBSTR(json, '\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}', 1, idx, 'e') as id       from input_with_occurences cross join seq        where idx <= number_of_occurrencesresult looks like:+------------------------------------+|id                                  |+------------------------------------+|0fc8a2a1-e334-43b8-9311-ce46da9cd32c||4344d89b-7f0d-4453-b2c5-d0d4a39d7d25|+------------------------------------+

Advertisement

Answer

1. Solution with python UDF

2. Solution with REGEX magic