Skip to content
Advertisement

How to remove Repeated field in BigQuery schema?

I have a schema that has a repeated field nested into another repeated field like so: person.children.toys. I want to make this inner field not repeated (so child can have only single nullable toy). I know that for such change I need to make a new table with new schema and run SQL query that inserts modified results into it, but I don’t know how to make the query. I need it to select first toy (or null) for each child and insert resulting objects into new table. There is a guarantee that in source table all children have no more than 1 toy.

Advertisement

Answer

Below is for BigQuery Standard SQL

I know – it might look over-complicated – but it totally preserves original schema while eliminating all but first (or null) toys. This can be handy if your real schema has more than just few fields so you don’t need to worry about them

If to apply to below data

result will be

Note: toys field originally REPEATED STRING becomes just STRING

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement