Skip to content
Advertisement

Selecting the first and last event per user, per day

I have a Google Analytics event which fires on my website when certain interactions are made, this may or may not fire for a user in a session, or can fire many times.

I’d like to return results showing the userID and the value of the first and last event label, per day. I have tried to do this with MAX(hits.eventInfo.eventLabel), but when I fact check my results this is not returning the last value for that user in the day as I was expecting.

SELECT Date,
customDimension.value AS UserID,
MAX(hits.eventInfo.eventLabel) AS last_value
FROM `project.dataset.ga_sessions_20*` AS t
  CROSS JOIN UNNEST(hits) AS hits
  CROSS JOIN UNNEST(t.customdimensions) AS customDimension
WHERE parse_date('%y%m%d', _table_suffix) between 
DATE_sub(current_date(), interval 1 day) and
DATE_sub(current_date(), interval 1 day)
AND hits.eventInfo.eventAction = "Value"
AND customDimension.index = 2
GROUP BY Date, UserID

For example, the query above returns results where user X has the following MAX() value:

20180806 User_x 69.96

But when I look at the details of that users interactions on the day I see:

enter image description here

Based on this, I would expect to see 79.95 as my MAX() result as it has the highest hit number, instead I seem to have selected a value from somewhere in the middle of the session – how can I adjust my query to ensure I select the last event value?

Advertisement

Answer

When you are looking for maximum value of column colA while doing GROUP BY – obviously MAX(colA) will work

But when you are looking for value in column colA based on maximum value in column colB – you should use STRING_AGG(colA ORDER BY colB DESC LIMIT 1) or similar using ARRAY_AGG()

So, in you case, I think it will be something like below (you should tune it further)

STRING_AGG(eventInfo.eventLabel ORDER BY hiNumber DESC LIMIT 1) AS last_value
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement