Skip to content
Advertisement

BigQuery finding sessions that have visited both pageA (contains keyword “main”) and pageB (contains keyword “side”)

On BQ I’m trying to find sessions that have visited both pageA (URL contains keyword “main”) and pageB (URL contains keyword “side”), and the pages that session visited. Here is my logic, I first wanted to find out sessions that have visited pageAs (URL contains keyword “main”), then I wanted to do a join, so to find out those sessions who have visited pageAs, what else pages on the site have they visited. Below is my query:

select a. sessionID, b.pagepath
from 
(SELECT
CONCAT(fullVisitorID, CAST(visitID AS string), date) AS sessionID,
 hits.page.pagepath as pagepath
 FROM
`xx.xxx.ga_sessions_*`,
UNNEST(hits) AS hits
WHERE
totals.visits = 1
and hits.page.pagepath like '%main%' 
AND _TABLE_SUFFIX BETWEEN '20220214'
AND '20220225') a
left join 
(SELECT
CONCAT(fullVisitorID, CAST(visitID AS string), date) AS sessionID,
 hits.page.pagepath as pagepath
FROM
`xx.xxx.ga_sessions_*`,
UNNEST(hits) AS hits
WHERE
totals.visits = 1 
AND _TABLE_SUFFIX BETWEEN '20220214'
AND '20220225') b 
on a.sessionID=b.sessionID
order by 1 desc

I am attaching an example output here:

sessionID Pagepath
123 /main-size
123 /main-size
456 /main-size
456 /main-size
456 /side-hide
456 /side-build
456 /June-event

In this case session 456 meets my condition because it has visited both page contains “main” and page contains “side”, however I wonder through this output how I can query to only get below output:

sessionID Pagepath
456 /main-size
456 /main-size
456 /side-hide
456 /side-build
456 /June-event

Advertisement

Answer

Consider below query.

 SELECT * EXCEPT(path)
   FROM sample_table, UNNEST([REGEXP_EXTRACT(Pagepath, r'(main|side)')]) path
QUALIFY COUNT(DISTINCT path) OVER (PARTITION BY sessionID) = 2

-- Query results
+-----+-----------+-------------+
| Row | sessionID |  Pagepath   |
+-----+-----------+-------------+
|   1 |       456 | /main-size  |
|   2 |       456 | /main-size  |
|   3 |       456 | /side-hide  |
|   4 |       456 | /June-event |
|   5 |       456 | /side-build |
+-----+-----------+-------------+
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement