I have dimensions.key_value
of RECORD type i run the following query with following output.
SELECT * from table; event_id value dimensions 1 140 {"key_value": [{"key": "app", "value": "20"}]} 2 150 {"key_value": [{"key": "region", "value": "8"}, {"key": "loc", "value": "1"}]} 3 600 {"key_value": [{"key": "region", "value": "8"}, {"key": "loc", "value": "2"}]}
To unnest the data i have created the following view:
with temp as ( select (select value from t.dimensions.key_value where key = 'region') as region, (select value from t.dimensions.key_value where key = 'loc') as loc, (select value from t.dimensions.key_value where key = 'app') as app, value, event_id from table t ) select * from temp;
My Output:
region loc app count event_id null null 20 140 1 8 1 null 150 2 8 2 null. 600. 3
There are two thing i need to verify is my query correct ?
How i can make the query generic i.e if i don’t know all the key
, there some other keys may also be present in our dataset ?
Update:
Problem : Let says a user want to do group by using region
and loc
so there is no easy way of writing the query for that i decided create a view so user can easily do group by
with temp as ( select (select value from t.dimensions.key_value where key = 'region') as region, (select value from t.dimensions.key_value where key = 'loc') as loc, (select value from t.dimensions.key_value where key = 'store') as store, value, metric_name, event_time from table t ) select * from temp;
Based on this view the user can easily do group by. So i wanted to check if their is way to create generic view since we don’t know the all the unique key
or is there a easy way to do groupby.
Advertisement
Answer
How i can make the query generic i.e if i don’t know all the key, there some other keys may also be present in our dataset ?
Consider below
execute immediate (select ''' select event_id, value, ''' || string_agg(''' (select value from b.key_value where key = "''' || key_name || '''") as ''' || key_name , ''', ''') || ''' from ( select event_id, value, array( select as struct json_extract_scalar(kv, '$.key') key, json_extract_scalar(kv, '$.value') value from a.kvs kv ) key_value from `project.dataset.table`, unnest([struct(json_extract_array(dimensions, '$.key_value') as kvs)]) a ) b ''' from ( select distinct json_extract_scalar(kv, '$.key') key_name from `project.dataset.table`, unnest(json_extract_array(dimensions, '$.key_value')) as kv ) )
if applied to sample data in your question – ooutput is
As you can see in query – there is no any explicit references to actual key names – rather they are dynamically extracted – so no need to know them in advance and there can be any number of them too