I’m trying to calculate the time spent by users on my platform on a weekly basis. I have the following api_request_logs table (*Note: each row is duplicate)
x
time user_id url
------------------------ ------- ----
January 6,2020, 08:56 PM 14 apiv3user1
January 6,2020, 08:58 PM 14 apiv3session
January 6,2020, 08:59 PM 14 apiv3notification
January 6,2020, 09:01 PM 14 apiv3announcements
January 6,2020, 09:02 PM 14 apiv3user2
January 6,2020, 09:28 PM 14 apiv3session
[enter image description here][1]
I tried this query, however, it doesn’t seems it is outputting correct calculation .
select SUM(timespent) timespent, a.user_id as user
from (
select timestamp_diff(max(time),min(time),minute) timespent, DATE_TRUNC(EXTRACT(date from time), day) as Time, a.user_id as user
from (
select * , (countif (start) over (order by time)) as grp from (
select *, ifnull (lag(url) over(order by time), '') !=url as start from logs.api_request_logs a
INNER JOIN logs.cohort_members on a.user_id = cohort_members.user_id
INNER JOIN `logs.cohorts` on cohort_members.cohort_id = `logs.cohorts`.cohort_id
INNER Join logs.users on cohort_members.user_id = users.user_id
WHERE a.user_id = 14 and time > timestamp ('2020-01-01 00:00:00')
)) group by user, grp)
group by 2
order by 2 asc
Current output:
timespent users Time
--------- ------- -----
32 min 14 January 6,2020, 12:00 AM
Desired output :
Timespent users Time
---------- -------- -----
6 min 14 January 6,2020, 12:00 AM
What I’m doing wrong or do you have a better idea to do it? Thanks
Advertisement
Answer
I think you just want to group by your url as well … try the following 🙂
#standardsql :
SELECT
SUM(timespent) timespent,
a.user_id AS user
FROM (
SELECT
TIMESTAMP_DIFF(MAX(time),MIN(time),minute) timespent,
DATE_TRUNC(EXTRACT(date FROM time), day) AS Time,
a.user_id AS user
,url
FROM (
SELECT
*,
(countif (start) OVER (ORDER BY time)) AS grp
FROM (
SELECT
*,
ifnull (LAG(url) OVER(ORDER BY time),
'') !=url AS start
FROM
logs.api_request_logs a
INNER JOIN
logs.cohort_members
ON
a.user_id = cohort_members.user_id
INNER JOIN
`logs.cohorts`
ON
cohort_members.cohort_id = `logs.cohorts`.cohort_id
INNER JOIN
logs.users
ON
cohort_members.user_id = users.user_id
WHERE
a.user_id = 14
AND time > timestamp ('2020-01-01 00:00:00') ))
GROUP BY
user,
gr,url)
GROUP BY
2
ORDER BY
2 ASC