Write a query that returns all urls involved in a “bounce.” A url X is involved in a bounce if a single user navigates from a page Y to page X, then immediately returns to page Y while visiting no other pages in between. That is, you’ll want to find two visits X->Y (on date d1) and Y->X (on date d2) for the same user u, for which there are no other visits by u between d1 and d2.
You can compare dates just like numbers: d1 < d2 is true if date d1 is prior to date d2.
I self joined this table:
url | dt | uid | src | rev |
---|---|---|---|---|
A04 | 1890-05-14 | A | A01 | 10 |
A02 | 1890-05-15 | A | A04 | 15 |
A01 | 1890-05-16 | A | A04 | 20 |
This is my query:
Select * FROM Visit V1, Visit V2 WHERE V1.url = V2.src AND V1.src = V2.url AND V1.dt < V2.dt AND V1.uid = V2.uid
I understand that the single user and navigation part, and comparing the dates part.
A01 would not work here because the single user A nagivated from A01 -> A04 -> A01, however, the user visited A02 in between so it would return an empty set. How would I make it so that it return the urls only when the single user visits no other pages while navigating back and forth?
Advertisement
Answer
You can add another condition in your where clause as below:
AND NOT EXISTS (SELECT * FROM Visit visit WHERE visit.uid=V1.uid and visit.dt>V1.dt AND visit.dt<V2.dt)