I have a datetime partitioned table in BigQuery with a schema like:
timestamp TIMESTAMP REQUIRED customer_id STRING REQUIRED plant_id STRING REQUIRED inverters RECORD REPEATED inverters. id STRING REQUIRED ...
So, if I do a SELECT * FROMmy_table`, it will generate a JSON with a structure like:
[
{
"timestamp": "2019-10-20 20:00:00 UTC",
"customer_id": "5d3f1aee5f70740d54c416e1",
"plant_id": "5d658cb25f70747626153e83",
"inverters": [
{
"id": "5d77e3225f70746ec02bd8e7",
...
},
...
]
},
...
]
And that format works well in most of my use cases, except for one.
So, if I wanted the id’s of all the inverters i would do
SELECT DISTINCT id FROM `my_table` CROSS JOIN UNNEST(inverters)
But I also need the timestamps grouped by inverter, but if I do, let’s say, a nested query for getting the timestamps by id, like:
SELECT inverter, timestamp FROM ( SELECT DISTINCT id FROM `my_table` CROSS JOIN UNNEST(inverters) ) AS inverter, `my_table`
then the resulting json would be:
[
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 11:15:00 UTC"
},
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 00:20:00 UTC"
},
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 04:05:00 UTC"
},
...
]
Now, that was a long introduction.
The question is: Is there a way of querying the data so the result of the query looks something like the following?
[
{
"inverter": {
"id": "5d791a505f70744c233b819d",
"timestamp": [
"2020-01-21 11:15:00 UTC",
"2020-01-21 00:20:00 UTC",
"2020-01-21 04:05:00 UTC",
...
]
},
...
]
Advertisement
Answer
Is this what you want?
SELECT inverter.id, ARRAY_AGG(timestamp)
FROM `my_table` t CROSS JOIN
UNNEST(t.inverters) inverter
GROUP BY inverter.id;
You say that SELECT * returns JSON. I think this is a function of your API. When you use the BigQuery API, you don’t see JSON.