I have a datetime partitioned table in BigQuery with a schema like:
x
timestamp TIMESTAMP REQUIRED
customer_id STRING REQUIRED
plant_id STRING REQUIRED
inverters RECORD REPEATED
inverters. id STRING REQUIRED
So, if I do a SELECT * FROM
my_table`, it will generate a JSON with a structure like:
[
{
"timestamp": "2019-10-20 20:00:00 UTC",
"customer_id": "5d3f1aee5f70740d54c416e1",
"plant_id": "5d658cb25f70747626153e83",
"inverters": [
{
"id": "5d77e3225f70746ec02bd8e7",
},
]
},
]
And that format works well in most of my use cases, except for one.
So, if I wanted the id’s of all the inverters i would do
SELECT DISTINCT id
FROM `my_table`
CROSS JOIN UNNEST(inverters)
But I also need the timestamps grouped by inverter, but if I do, let’s say, a nested query for getting the timestamps by id, like:
SELECT inverter, timestamp
FROM (
SELECT DISTINCT id
FROM `my_table`
CROSS JOIN UNNEST(inverters)
) AS inverter, `my_table`
then the resulting json would be:
[
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 11:15:00 UTC"
},
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 00:20:00 UTC"
},
{
"inverter": {
"id": "5d791a505f70744c233b819d"
},
"timestamp": "2020-01-21 04:05:00 UTC"
},
]
Now, that was a long introduction.
The question is: Is there a way of querying the data so the result of the query looks something like the following?
[
{
"inverter": {
"id": "5d791a505f70744c233b819d",
"timestamp": [
"2020-01-21 11:15:00 UTC",
"2020-01-21 00:20:00 UTC",
"2020-01-21 04:05:00 UTC",
]
},
]
Advertisement
Answer
Is this what you want?
SELECT inverter.id, ARRAY_AGG(timestamp)
FROM `my_table` t CROSS JOIN
UNNEST(t.inverters) inverter
GROUP BY inverter.id;
You say that SELECT *
returns JSON. I think this is a function of your API. When you use the BigQuery API, you don’t see JSON.