Skip to content
Advertisement

Counting events with a specified event.value.string_value in big query

This is a follow up question to this question I have a query:

WITH
  user_summary
AS
(
  SELECT
    geo.country as country,
    platform,
    event_date,
    user_pseudo_id,
    MAX(CASE WHEN event_name = 'session_start'            THEN 1 ELSE 0 end)   AS `has_session_start`,
    MAX(CASE WHEN event_name = 'purchase_preview_page'    THEN 1 ELSE 0 end)   AS `has_purchase_preview_page`,
    MAX(CASE WHEN event_name = 'purchase_trial_activated' THEN 1 ELSE 0 end)   AS `has_purchase_trial_activated`,
    MAX(CASE WHEN event_name = 'purchase_completed'       THEN 1 ELSE 0 end)   AS `has_purchase_completed`
    
  FROM
    `project.dataset*`
  WHERE
    event_date > '20200101'
  GROUP BY
    geo.country,
    platform,
    event_date,
    user_pseudo_id

)
SELECT
  country,
  platform,
  event_date,
  SUM(has_session_start)                                      AS count_session_start,
  SUM(has_purchase_preview_page)                              AS count_purchase_preview_page,
  SUM(has_purchase_trial_activated)                           AS count_purchase_trial_activated,
  SUM(has_purchase_completed)                                 AS count_purchase_completed,
  SUM(has_purchase_trial_activated * has_purchase_completed)  AS count_trial_activated_and_purchased
FROM
  user_summary
GROUP BY
  country,
  platform,
  event_date

Which returns:

enter image description here

I now want to update the query to only count those purchases that have

event_params.value.string_value like '%subscription%'

That event is stored like so:

enter image description here

I added:

...
MAX(CASE WHEN event_name = 'purchase_completed'  and  ep.value.string_value like '%subscription%'   THEN 1 ELSE 0 end)   AS `has_purchase_completed`
    
  FROM
    `project.dataset.*`
    CROSS JOIN
    UNNEST(event_params) ep
...

Is this the correct way to count only those events that has a specified event_params.value.string_value?

I am not sure about the CROSS JOIN usage here, but that is the example I’ve seen to reach the event_params values and keys in order not to get this error:

Cannot access field value on a value with type ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, …>>> at [3:20]

When using:

SELECT *
FROM `project.dataset.events_20210207` 
WHERE event_params.value. string_value like '%subscription%';

Advertisement

Answer

It could be faster with subquery:

...
MAX(CASE WHEN event_name = 'purchase_completed' and exists(select 1 from UNNEST(event_params) as e where e.value.string_value like '%subscription%') THEN 1 ELSE 0 end)   AS `has_purchase_completed`
....
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement