Skip to content
Advertisement

how to group the data by some interval datetimes?

there are many devices and while using it will upload data every some seconds or minutes. I want to get the sections of date-time that the device is in use

 Id  date-time                                    value
 0  2021-07-08 14:46:46                       1
 1  2021-07-08 14:47:47                       5
 2  2021-07-08 14:48:48                       2
 3  2021-07-08 14:49:49                       4
 4  2021-07-08 15:30:01                       7
 5  2021-07-08 15:30:46                       4
 6  2021-07-08 15:30:46                       4
 7  2021-07-08 15:50:04                       4
 8  2021-07-08 15:50:05                       6

can it be true that group the data by an interval? let us consider interval = 1 minutes then group the data which the minus of the two date-time is more than 1 minutes. then Id=0 or Id=1 or Id=2 or Id=3 is one group and Id=4 and Id=5 and Id=6 and Id=7 and Id=8 is another group

what I want is the group is a nearly date-time.

If the difference between two records is more than 1 minute then they are in two groups. If not they are in the same groups.

which means in the same group time1 will be smaller than 1 minutes to one of the other time.

If the time difference is 1 or 10 minutes larger than the previous record it will belong to a new groups

and I am using MYSQL

Advertisement

Answer

  • You can use lag window function to obtain previous date_time.
  • One way to calculate the time difference in seconds is to convert timestamp type to integer by unix_timestamp function.
  • Make a newgroup flag which equals one if and only if the difference from the previous record is larger than 60*10 seconds (10 minutes).
  • Cumulative sum of newgroup would become the section group ID.
with tmp AS (
  SELECT
    *,
    coalesce(unix_timestamp(date_time) - unix_timestamp(lag(date_time) over (ORDER BY date_time)), 0) > 60*10 AS newgroup
  FROM
    tbl
)
,tmp2 AS (
  SELECT 
    *,
    sum(newgroup) over (ORDER BY date_time) AS groupid 
  FROM
    tmp 
)
SELECT * FROM tmp2

This query would get:

    id   date_time  value   newgroup    groupid
    0   2021-07-08 14:46:46 1   0   0
    1   2021-07-08 14:47:47 5   0   0
    2   2021-07-08 14:48:48 2   0   0
    3   2021-07-08 14:49:49 4   0   0
    4   2021-07-08 15:30:01 7   1   1
    5   2021-07-08 15:30:46 4   0   1
    6   2021-07-08 15:30:46 4   0   1
    7   2021-07-08 15:50:04 4   1   2
    8   2021-07-08 15:50:05 6   0   2
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement