Skip to content
Advertisement

Determine if an event happened every hour for 24 straight hours

Edit – I am still digesting the great responses and learning how they work, awesome knowledge!

Using SQL Server 2016, I am trying to figure out if any users have done something at least once every hour, for 24 consecutive hours.

I have a table similar to this, in the example below, the only user who qualifies is ‘21482’. They actually have more than one event an hour (which is fine) but they have done at least one thing every hour for 24 straight hours. They also have events on 8-22 but they would not qualify as they are not consecutive (i.e. there’s a gap between the 8-22 events and 8-23 events).

The other user ‘45578’, does not qualify, even though they have more than 24 events in a single day, there was not activity during every hour for 24 straight hours. It doesn’t matter if the activity spans more than a single day, that is fine.

Ideas on how to accomplish something like this?

CREATE TABLE AuditData
    ([UserID] varchar(20), [Event_Timestamp] datetime)
;

INSERT INTO AuditData
    ([UserID], [Event_Timestamp])
VALUES
(21482, '2021-08-22 00:15:30.000'),
(21482, '2021-08-22 00:25:30.000'),
(21482, '2021-08-22 02:05:45.000'),
(21482, '2021-08-22 04:15:15.000'),
(21482, '2021-08-22 06:25:10.000'),
(21482, '2021-08-22 08:01:05.000'),
(21482, '2021-08-23 00:05:30.000'),
(21482, '2021-08-23 01:05:45.000'),
(21482, '2021-08-23 02:15:15.000'),
(21482, '2021-08-23 03:25:10.000'),
(21482, '2021-08-23 04:01:05.000'),
(21482, '2021-08-23 05:55:55.000'),
(21482, '2021-08-23 06:50:58.000'),
(21482, '2021-08-23 07:30:02.000'),
(21482, '2021-08-23 08:10:12.000'),
(21482, '2021-08-23 08:14:12.000'),
(21482, '2021-08-23 09:18:30.000'),
(21482, '2021-08-23 10:17:32.000'),
(21482, '2021-08-23 11:30:21.000'),
(21482, '2021-08-23 11:32:21.000'),
(21482, '2021-08-23 12:45:30.000'),
(21482, '2021-08-23 13:01:30.000'),
(21482, '2021-08-23 14:59:30.000'),
(21482, '2021-08-23 15:44:30.000'),
(21482, '2021-08-23 16:49:30.000'),
(21482, '2021-08-23 17:50:30.000'),
(21482, '2021-08-23 18:10:30.000'),
(21482, '2021-08-23 19:02:45.000'),
(21482, '2021-08-23 20:11:50.000'),
(21482, '2021-08-23 21:55:15.000'),
(21482, '2021-08-23 22:16:55.000'),
(21482, '2021-08-23 23:58:10.000'),
(45578, '2021-08-23 05:05:30.000'),
(45578, '2021-08-23 05:05:45.000'),
(45578, '2021-08-23 05:15:15.000'),
(45578, '2021-08-23 05:25:10.000'),
(45578, '2021-08-23 05:41:05.000'),
(45578, '2021-08-23 05:55:55.000'),
(45578, '2021-08-23 06:50:58.000'),
(45578, '2021-08-23 07:30:02.000'),
(45578, '2021-08-23 08:10:12.000'),
(45578, '2021-08-23 09:18:30.000'),
(45578, '2021-08-23 10:17:32.000'),
(45578, '2021-08-23 11:30:21.000'),
(45578, '2021-08-23 11:31:21.000'),
(45578, '2021-08-23 12:32:21.000'),
(45578, '2021-08-23 13:33:21.000'),
(45578, '2021-08-23 13:33:21.000'),
(45578, '2021-08-23 13:33:21.000'),
(45578, '2021-08-23 14:33:21.000'),
(45578, '2021-08-23 14:33:21.000'),
(45578, '2021-08-23 14:33:21.000'),
(45578, '2021-08-23 14:33:21.000'),
(45578, '2021-08-23 15:33:21.000'),
(45578, '2021-08-23 15:33:21.000'),
(45578, '2021-08-23 15:33:21.000'),
(45578, '2021-08-23 15:33:21.000'),
(45578, '2021-08-23 15:33:21.000'),
(45578, '2021-08-23 16:33:21.000'),
(45578, '2021-08-23 16:33:21.000')
;

Thanks for any suggestions.

Advertisement

Answer

This is a type of gaps-and-islands problem.

There are many solutions.

  • One of the classic ones is to use LAG to check if the current row is the start of a new group, assign group IDs using a running COUNT, then group by this new ID

To truncate to the beginning of the hour, we use DATEADD(hour, DATEDIFF(hour, 0, ad.Event_Timestamp), 0))

WITH StartPoints AS (
    SELECT ad.UserID, v.Hour,
      IsStart = CASE WHEN LAG(v.Hour) OVER (PARTITION BY ad.UserID ORDER BY v.Hour)
                >= DATEADD(hour, -1, v.Hour) THEN NULL ELSE 1 END
    FROM AuditData ad
    CROSS APPLY(VALUES(DATEADD(hour, DATEDIFF(hour, 0, ad.Event_Timestamp), 0))) v(Hour)
    GROUP BY ad.UserID, v.Hour
),
Groupings AS (
    SELECT *,
      GroupId = COUNT(IsStart) OVER (PARTITION BY ad.UserID ORDER BY ad.Hour ROWS UNBOUNDED PRECEDING)
    FROM StartPoints ad
)
SELECT DISTINCT ad.UserID
FROM Groupings ad
GROUP BY ad.UserID, ad.GroupId
HAVING COUNT(*) >= 24;
  • Another solution is, which works here because all we are looking for is the existence of an island, is to use LAG with an offset of 23 to check whether that row is exactly 23 hours before current. Then we simply group by UserID
WITH Hours AS (
    SELECT ad.UserID, v.Hour,
        Hour24 = LAG(v.Hour, 23) OVER (PARTITION BY ad.UserID ORDER BY v.Hour)
    FROM AuditData ad
    CROSS APPLY(VALUES(DATEADD(hour, DATEDIFF(hour, 0, ad.Event_Timestamp), 0))) v(Hour)
    GROUP BY ad.UserID, v.Hour
)
SELECT DISTINCT ad.UserID
FROM Hours ad
WHERE ad.Hour24 = DATEADD(hour, -23, ad.Hour);

db<>fiddle

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement