I have the following data which I want to group into seconds per day in BigQuery.
Source Table:
+--------------+---------------------+---------------------+ | ComputerName | StartDatetime | EndDatetime | +--------------+---------------------+---------------------+ | Computer1 | 2020-06-10T21:01:28 | 2020-06-10T21:20:19 | +--------------+---------------------+---------------------+ | Computer1 | 2020-06-10T22:54:01 | 2020-06-11T05:21:48 | +--------------+---------------------+---------------------+ | Computer2 | 2020-06-08T09:11:54 | 2020-06-10T11:36:27 | +--------------+---------------------+---------------------+
I want to be able to visualise the data in the following way
+------------+--------------+------------------+ | Date | ComputerName | Runtime(Seconds) | +------------+--------------+------------------+ | 2020-10-10 | Computer1 | 5089 | +------------+--------------+------------------+ | 2020-10-11 | Computer1 | 19308 | +------------+--------------+------------------+ | 2020-10-08 | Computer2 | 53285 | +------------+--------------+------------------+ | 2020-10-09 | Computer2 | 86400 | +------------+--------------+------------------+ | 2020-10-10 | Computer2 | 41787 | +------------+--------------+------------------+
I am not too sure of the way I should approach this. Some input would be greatly appreciated.
Advertisement
Answer
This is an interval overlap problem. You can solve this by splitting each time period into separate days and then looking at the overlap for each day:
with t as ( select 'Computer1' as computername, datetime '2020-06-10T21:01:28' as startdatetime, datetime '2020-06-10T21:20:19' as enddatetime union all select 'Computer1' as computername, datetime '2020-06-10T22:54:01' as startdatetime, datetime '2020-06-11T05:21:48' as enddatetime union all select 'Computer2' as computername, datetime '2020-06-08T09:11:54' as startdatetime, datetime '2020-06-10T11:36:27' as enddatetime ) select dte, t.computername, sum(case when enddatetime >= dte and startdatetime < date_add(dte, interval 1 day) then datetime_diff(least(date_add(dte, interval 1 day), enddatetime), greatest(dte, startdatetime), second) end) as runtime_seconds from (select t.*, generate_date_array(date(t.startdatetime), date(t.enddatetime), interval 1 day) gda from t ) t cross join unnest(gda) dte group by dte, t.computername;