I’m linking Google Analytics 4 events into a BigQuery table. I can retrieve data just based on one key, but how do I get the value stored in another key in the same record as that key?
In specific, I want to rank the number of views by the name of the article, and then provide the author’s name in a separate column as a supplementary data (the author’s name is stored in the same record with a different key in a nested column).
Environment
Google Analytics 4 Set up events in Google Tag Manager
The table schema looks like this, with keys such as article_name
author_name
in event_params.key
and the value you want to get in event_params.value.string_value
.
And the table preview looks like this:
+-----+------------+-----------------+--------------+------------------+---------------------------------+ | Row | event_date | event_timestamp | event_name | event_params.key | event_params.value.string_value | +-----+------------+-----------------+--------------+------------------+---------------------------------+ | 1 | 20201127 | 160394324324231 | view_article | article_name | My Article A | | | | | | author_name | Author A | | | | | | random key1 | random value1 | | | | | | random key2 | random value2 | | 2 | 20201127 | 160394324324112 | view_article | article_name | My Article B | | | | | | author_name | Author B | | | | | | random key1 | random value3 | | | | | | random key2 | random value4 | ... +-----+------------+-----------------+--------------+------------------+---------------------------------+
What I tried
Was able to pull off the article ranking itself, without the author name.
#standardSQL WITH _data AS ( SELECT value.string_value AS article_name FROM `my-new-project.analytics_000000000.events_*`, UNNEST(event_params) WHERE event_name = 'article_view' ) SELECT article_name, COUNT(*) AS cnt FROM _data GROUP BY 1 ORDER BY 2 DESC
Result:
+-----+--------------+-----+ | Row | article_name | cnt | +-----+--------------+-----+ | 1 | My Article A | 20 | | 2 | My Article D | 18 | | 3 | My Article C | 11 | | 4 | My Article B | 9 | ... +-----+--------------+-----+
I wanted to add a column for author_name
next to article_name
here, so I thought it would be a good idea to use CASE WHEN.
But as it turns out, the author_name
will all be null, which probably means it is treated as a separate record.
#standardSQL WITH _data AS ( SELECT CASE WHEN key = 'article_name' THEN value.string_value END AS article_name, CASE WHEN key = 'author_name' THEN value.string_value END AS author_name FROM `my-new-project.analytics_000000000.events_*`, UNNEST(event_params) WHERE key = 'article_name' ) SELECT article_name, MAX(author_name), COUNT(*) AS cnt FROM _data GROUP BY 1 ORDER BY 3 DESC
Result:
+-----+--------------+-------------+-----+ | Row | article_name | author_name | cnt | +-----+--------------+-------------+-----+ | 1 | My Article A | null | 20 | | 2 | My Article D | null | 18 | | 3 | My Article C | null | 11 | | 4 | My Article B | null | 9 | ... +-----+--------------+-------------+-----+
When I GROUP BY with author_name
in a descending order, the author’s name appears correctly, but this time the article_name
is all null. Is it possible to have both article_name and author_name in the same record, and have the author name next to article name in the same ranking result?
Advertisement
Answer
I think you’re looking for a sub-select solution:
SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'article_name') AS article_name, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'author_name') AS author_name, count(1) as cnt FROM `my-new-project.analytics_000000000.events_*` GROUP BY 1,2 ORDER BY 3 DESC