Skip to content
Advertisement

Is it possible to get active sessions per hour in SQL?

start_time end_time HostID gameID
6/14/2021 20:13 6/14/2021 22:22 1 AB1
6/14/2021 20:20 6/14/2021 21:47 2 AB2
6/14/2021 20:22 6/14/2021 22:07 3 AB3
6/14/2021 20:59 6/14/2021 21:15 4 AB4
6/15/2021 21:24 6/15/2021 22:09 1 AB5
6/15/2021 21:24 6/15/2021 21:59 2 AB6
6/15/2021 23:11 6/16/2021 01:22 4 AB7
6/16/2021 20:13 6/16/2021 21:23 3 AB8

I have a table that has a start time and end time. I want to count the active game rooms per hour. I know I should have at least try solving this but I really don’t know where to start or if is this even possible with SQL.

What I first did is to COUNT how many gameID there are in an hour using the start_time. But I’m sure I did not answer the ‘active’ per session question. What I did was to just COUNT how many has started a game per hour.

The expected result is something like this

DayHour

Time Active
6/14/2021 2000 4
6/15/2021 2100 4
6/16/2021 2200 2
6/15/2021 2100 2
6/16/2021 2200 1
6/17/2021 2300 1
6/16/2021 0 1
6/17/2021 1 1
6/18/2021 2000 1
6/19/2021 2100 1

Or count of active sessions per hour without grouping by day.

Hr

Time Active
2000 5
2100 7
2200 3
2300 1
0000 1
0001 1

Advertisement

Answer

I suggest below solution

select timestamp_trunc(minute, hour) hour, 
  count(distinct hostid) hosts,
  count(distinct gameid) games
from `project.dataset.table`, 
unnest(generate_timestamp_array(
  parse_timestamp('%m/%d/%Y %H:%M', start_time), 
  parse_timestamp('%m/%d/%Y %H:%M', end_time), 
  interval 1 minute)) minute
group by hour
# order by hour          

if applied to sample data in your question (with fix for end_time in last row – should be 6/16/2021 21:23 – not 6/6/2021 21:23) – output is

enter image description here

Brief explanation

  1. Expand/split each original row into one row for each minute between start_time and end_time
  2. And then, simply aggregate by hour applying count(distinct ...)

Or count of active sessions per hour without grouping by day.

You can apply exactly same approach

select extract(hour from minute) hour, 
  count(distinct hostid) hosts,
  count(distinct gameid) games
from `project.dataset.table`, 
unnest(generate_timestamp_array(
  parse_timestamp('%m/%d/%Y %H:%M', start_time), 
  parse_timestamp('%m/%d/%Y %H:%M', end_time), 
  interval 1 minute)) minute
group by hour
order by hour             

with output

enter image description here

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement