I have a table field with a value like this:
x
{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}
I want to know if there is some way to get the “0fc8a2a1-e334-43b8-9311-ce46da9cd32c” and “4344d89b-7f0d-4453-b2c5-d0d4a39d7d25” values.
Advertisement
Answer
Redshift is not good with JSON, and especially not good with arbitrary JSON keys (as @GMB mentioned). Also it’s not good with nested data structures.
So actually, you have 2 problems:
- extacting the json keys. I see 2 options here:
- Use python UDF
- Use regex
- unnesting an array of keys into a table. There’s a trick to unnest data into rows (see
CROSS JOIN
withseq
table in queries belowbelow) – described in this SO answer.
1. Solution with python UDF
you can implement json parsing in python and register it as a user defined function https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
function:
create or replace function f_py_json_keys (a varchar(65535))
returns varchar(65535)
stable
as $$
import json
return ",".join(json.loads(a).keys())
$$ language plpythonu;
query:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
), seq(idx) as (
select 1 UNION ALL
select 2 UNION ALL
select 3 UNION ALL
select 4 UNION ALL
select 5
), input_with_occurences as (
select f_py_json_keys(json) as keys,
regexp_count(keys, ',') + 1 as number_of_occurrences
from input
)
select
split_part(keys, ',', idx) as id
from input_with_occurences cross join seq
where idx <= number_of_occurrences
2. Solution with REGEX magic
Redshift has some regex functions. That’s a working example that would do the job for the payload you specified:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
), seq(idx) as (
select 1 UNION ALL
select 2 UNION ALL
select 3 UNION ALL
select 4 UNION ALL
select 5
), input_with_occurences as (
select *,
regexp_count(json,
'\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}') as number_of_occurrences
from input
)
select
REGEXP_SUBSTR(json, '\{?\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\":\s\{[\w\s\":,]+\}', 1, idx, 'e') as id
from input_with_occurences cross join seq
where idx <= number_of_occurrences
result looks like:
+------------------------------------+
|id |
+------------------------------------+
|0fc8a2a1-e334-43b8-9311-ce46da9cd32c|
|4344d89b-7f0d-4453-b2c5-d0d4a39d7d25|
+------------------------------------+