How to remove Repeated field in BigQuery schema?

Question

I have a schema that has a repeated field nested into another repeated field like so: person.children.toys. I want to make this inner field not repeated (so child can have only single nullable toy). I &#8230;

Accepted Answer

Below is for BigQuery Standard SQL    I know  &#8211; it might look over-complicated  &#8211; but it totally preserves original schema while eliminating all but first (or null) toys. This can be handy if your real schema has more than just few fields so you don&#8217;t need to worry about them    #standardSQLWITH `project.dataset.table` AS (  SELECT 1 id, STRUCT([STRUCT('mike' AS name, ['woody'] AS toys)] AS children) AS person    UNION ALL  SELECT 2 id, STRUCT([STRUCT('nik', ['buzz', 'bobeep']), ('john', ['car', 'buzz', 'bobeep'])] AS children) AS person   UNION ALL  SELECT 3 id, STRUCT([STRUCT('vincent', IF(TRUE,[],['']))] AS children) AS person)SELECT *   REPLACE(    (SELECT AS STRUCT *         REPLACE (          (SELECT ARRAY_AGG(t) FROM          (SELECT * REPLACE((SELECT toy FROM UNNEST(toys) toy WITH OFFSET ORDER BY OFFSET LIMIT 1) AS toys) FROM UNNEST(children)) t)          AS children)      FROM UNNEST([person]))  AS person)FROM `project.dataset.table`     If to apply to below data  Row id  person.children.name    person.children.toys     1   1   mike                    toy1     2   2   nik                     toy2                                     toy3             john                    toy4                                     toy5                                     toy6     3   3   vincent                result will be    Row id  person.children.name    person.children.toys     1   1   mike                    toy1     2   2   nik                     toy2             john                    toy4     3   3   vincent                 null       Note: toys field originally REPEATED STRING becomes just STRING

Advertisement

Answer