I’m trying to find a SQL query that lets me to merge some rows into one from a table that have a gap less than 5 seconds. For example, I have a table like the following:
x
Name | Time
==============================
John 2021-02-01 13:08:10
John 2021-02-01 13:08:12
John 2021-02-01 17:35:23
John 2021-02-07 11:16:31
Walt 2021-01-14 10:23:48
Joseph 2021-01-23 07:04:33
Walt 2021-01-14 10:23:51
Walt 2021-01-04 09:22:45
So, I want to have a result like this:
Name | Time
==============================
John 2021-02-01
John 2021-02-01
John 2021-02-07
Walt 2021-01-14
Walt 2021-01-04
Joseph 2021-01-23
For John there are two rows that have a gap less than 5 seconds, so they will merge in one row for the same day. The same happens with Walt.
Can I do this with a SQL query?
Thank you in advance.
Advertisement
Answer
You just need to check if the next date is within 5 seconds after the current row and if so – remove such a row. This can be achieved with LEAD
analytical function.
with a as (
select 'John' as name, convert(datetime, '2021-02-01 13:08:10', 120) as dt union all
select 'John' as name, convert(datetime, '2021-02-01 13:08:12', 120) as dt union all
select 'John' as name, convert(datetime, '2021-02-01 13:08:15', 120) as dt union all
select 'John' as name, convert(datetime, '2021-02-01 17:35:23', 120) as dt union all
select 'John' as name, convert(datetime, '2021-02-07 11:16:31', 120) as dt union all
select 'Walt' as name, convert(datetime, '2021-01-14 10:23:48', 120) as dt union all
select 'Joseph' as name, convert(datetime, '2021-01-23 07:04:33', 120) as dt union all
select 'Walt' as name, convert(datetime, '2021-01-14 10:23:51', 120) as dt union all
select 'Walt' as name, convert(datetime, '2021-01-04 09:22:45', 120) as dt
)
, gap_size as (
select
name,
dt,
/*Check the difference between current row and the next row per name*/
datediff(s,
dt,
lead(dt) over(partition by name order by dt asc)
) as within_5_secs_with_next
from a
)
select
name,
cast(dt as date) as dt_date
from gap_size
where coalesce(within_5_secs_with_next, 10) >= 5
order by name, dt asc
GO
name | dt_date :----- | :--------- John | 2021-02-01 John | 2021-02-01 John | 2021-02-07 Joseph | 2021-01-23 Walt | 2021-01-04 Walt | 2021-01-14
db<>fiddle here