This question is a continuation of these two:
We have a table in Big Query like below.
Input table:
x
Name | Question | Answer
-----+-----------+-------
Bob | Interest | ["a"]
Sue | Interest | ["a", "b"]
Joe | Interest | ["b"]
Joe | Gender | Male
Bob | Gender | Female
Sue | DOB | 2020-10-17
Bob | Others | { "country" : "es", "language" : "ca"}
Note: All the values in the Answer column are stringified values and the Arrays / JSON objects are dynamic.
We want to convert the above table to the below format to make it BI/Visualisation friendly.
Desired table:
+-------------------------------------------------------------+
| Name | a | b | c | Gender | DOB | country | language |
+-------------------------------------------------------------+
| Bob | 1 | 0 | 0 | Female | 2020-10-17 | es | ca |
| Sue | 1 | 1 | 0 | - | - | - | - |
| Joe | 0 | 1 | 0 | Male | - | - | - |
+-------------------------------------------------------------+
Advertisement
Answer
Below is for BigQuery Standard SQL
#standardSQL
create temp table data as
select name, question, value as answer
from `project.dataset.table`,
unnest(split(translate(answer, '[]" ', ''))) value
where question = 'Interest'
union all
select name, question, answer
from `project.dataset.table`
where not question in ('Interest', 'Others')
union all
select name,
split(value, ':')[offset(0)] as question,
split(value, ':')[offset(1)] as answer
from `project.dataset.table`,
unnest(split(translate(answer, '{}" ', ''))) value
where question = 'Others';
EXECUTE IMMEDIATE (
SELECT """
SELECT name, """ || STRING_AGG("""MAX(IF(answer = '""" || value || """', 1, 0)) AS """ || value, ', ')
FROM (
SELECT DISTINCT answer value FROM data
WHERE question = 'Interest' ORDER BY value
)) || (
SELECT ", " || STRING_AGG("""MAX(IF(question = '""" || value || """', answer, '-')) AS """ || value, ', ')
FROM (
SELECT DISTINCT question value FROM data
WHERE question != 'Interest' ORDER BY value
)) || """
FROM data
GROUP BY name
""";
if to apply to sample data from your question
with `project.dataset.table` AS (
select 'Bob' name, 'Interest' question, '["a"]' answer union all
select 'Sue', 'Interest', '["a", "b"]' union all
select 'Joe', 'Interest', '["b"]' union all
select 'Joe', 'Gender', 'Male' union all
select 'Bob', 'Gender', 'Female' union all
select 'Sue', 'DOB', '2020-10-17' union all
select 'Bob', 'Others', '{ "country" : "es", "language" : "ca"}'
)
the output is
Note: EXECUTE IMMEDIATE
part of above script is exactly the same as in previous post – the change is only in preparing original data into temp table data
and than using it in EXECUTE IMMEDIATE