I have a Google Analytics event which fires on my website when certain interactions are made, this may or may not fire for a user in a session, or can fire many times.
I’d like to return results showing the userID and the value of the first and last event label, per day. I have tried to do this with MAX(hits.eventInfo.eventLabel), but when I fact check my results this is not returning the last value for that user in the day as I was expecting.
SELECT Date, customDimension.value AS UserID, MAX(hits.eventInfo.eventLabel) AS last_value FROM `project.dataset.ga_sessions_20*` AS t CROSS JOIN UNNEST(hits) AS hits CROSS JOIN UNNEST(t.customdimensions) AS customDimension WHERE parse_date('%y%m%d', _table_suffix) between DATE_sub(current_date(), interval 1 day) and DATE_sub(current_date(), interval 1 day) AND hits.eventInfo.eventAction = "Value" AND customDimension.index = 2 GROUP BY Date, UserID
For example, the query above returns results where user X has the following MAX() value:
20180806 User_x 69.96
But when I look at the details of that users interactions on the day I see:
Based on this, I would expect to see 79.95 as my MAX() result as it has the highest hit number, instead I seem to have selected a value from somewhere in the middle of the session – how can I adjust my query to ensure I select the last event value?
Advertisement
Answer
When you are looking for maximum value of column colA
while doing GROUP BY – obviously MAX(colA)
will work
But when you are looking for value in column colA
based on maximum value in column colB
– you should use STRING_AGG(colA ORDER BY colB DESC LIMIT 1)
or similar using ARRAY_AGG()
So, in you case, I think it will be something like below (you should tune it further)
STRING_AGG(eventInfo.eventLabel ORDER BY hiNumber DESC LIMIT 1) AS last_value