I think this is a more complex version of the question in Update values in struct arrays in BigQuery.
I’m trying to update some of the fields in a struct, where the struct is heavily nested. I’m having trouble creating the SQL to do it. Here’s my table schema:
CREATE TABLE `my_dataset.test_data_for_so` ( date DATE, hits ARRAY<STRUCT<search STRUCT<query STRING, other_column STRING>, metadata ARRAY<STRUCT<key STRING, value STRING>>>> );
This is what the schema looks like in the BigQuery GUI after I create the table:
Here’s the data I’ve inserted:
INSERT INTO `my_dataset.test_data_for_so` (date, hits) VALUES ( CAST('2021-01-01' AS date), [ STRUCT( STRUCT<query STRING, other_column STRING>('foo bar', 'foo bar'), [ STRUCT<key STRING, value STRING>('foo bar', 'foo bar') ] ) ] )
My goal is to transform the “search.query” and “metadata.value” fields. For example, uppercasing them, leaving every other column (and every other struct field) in the row unchanged.
I’m looking for a solution involving either manually specifying each column in the SQL, or preferably, one where I can only mention the columns/fields I want to transform in the SQL, omitting all other columns/fields. This is a minimal example. The table I’m working on in production has hundreds of columns and fields.
For example, that row, when transformed this way, would change from:
[ { "date": "2021-01-01", "hits": [ { "search": { "query": "foo bar", "other_column": "foo bar" }, "metadata": [ { "key": "foo bar", "value": "foo bar" } ] } ] } ]
to:
[ { "date": "2021-01-01", "hits": [ { "search": { "query": "FOO BAR", "other_column": "foo bar" }, "metadata": [ { "key": "foo bar", "value": "FOO BAR" } ] } ] } ]
Advertisement
Answer
preferably, one where I can only mention the columns/fields I want to transform in the SQL …
Use below approach – it does exactly what you wish – ONLY those fields that are to be updated are in use, all other (tens or hundreds …) are preserved as is
update your_table set hits = array( select as struct * replace( (select as struct * replace (upper(query) as query) from unnest([search])) as search, array(select as struct * replace(upper(value) as value) from unnest(metadata)) as metadata ) from unnest(hits) ) where true;
if applied to sample data in your question – result is