I have the following data:
Timestamp          Min     MAX    Category
2019-03-22 08:10   12.00   5.10   AAAA
2019-03-22 08:11   10.00   5.20   AAAA
2019-03-22 08:12   11.00   4.90   AAAA
2019-03-22 08:13   12.00   5.30   BBBB
2019-03-22 08:14   14.00   5.20   BBBB
2019-03-22 08:15    9.40   5.20   CCCC
2019-03-22 08:16    7.40   5.10   CCCC
2019-03-22 08:17   14.00   5.10   AAAA
2019-03-22 08:17   14.00   5.10   AAAA
I am looking for a SQL (or Linq) query that groups this data by consecutive runs of the same Category,
so that I can see the start and end of each category over time.
Answer
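If I follow you correctly, this is a gaps-and-islands problem.
In pure SQL, this is usually solved with window functions and aggregation. Within a run of consecutive rows that share a category, the overall row number and the per-category row number both increase by one per row, so their difference stays constant. Grouping by the category together with that difference identifies the run each record belongs to: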
SELECT
    category,
    MIN(timestamp) start_timestap,
    MAX(timestamp) end_timestap
FROM (
    SELECT
        t.*,
        ROW_NUMBER() OVER(ORDER BY Timestamp) rn1,
        ROW_NUMBER() OVER(PARTITION BY Category ORDER BY Timestamp) rn2
    FROM mytable t
) x
GROUP BY category, rn1 - rn2
ORDER BY start_timestap
In this DB Fiddle with your sample data, this returns:
| category | start_timestap | end_timestap |
| -------- | ------------------- | ------------------- |
| AAAA | 2019-03-22 08:10:00 | 2019-03-22 08:12:00 |
| BBBB | 2019-03-22 08:13:00 | 2019-03-22 08:14:00 |
| CCCC | 2019-03-22 08:15:00 | 2019-03-22 08:16:00 |
| AAAA | 2019-03-22 08:17:00 | 2019-03-22 08:17:00 |
NB: this is a MySQL 8.0 DB Fiddle, but the query is standard ANSI SQL and should work on any platform that supports window functions.
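
To see why the row-number difference isolates each run, it can help to trace the same logic outside the database. The following is a minimal Python sketch, not part of the original answer: the function name find_islands and the in-memory rows list are hypothetical, and only the Timestamp and Category columns from the sample data are kept. It computes rn1 and rn2 the same way the two ROW_NUMBER() calls do, then groups by (category, rn1 - rn2).

```python
from collections import defaultdict

# Sample (timestamp, category) pairs copied from the question.
# The Min and MAX columns are omitted because the grouping does not use them.
rows = [
    ("2019-03-22 08:10", "AAAA"),
    ("2019-03-22 08:11", "AAAA"),
    ("2019-03-22 08:12", "AAAA"),
    ("2019-03-22 08:13", "BBBB"),
    ("2019-03-22 08:14", "BBBB"),
    ("2019-03-22 08:15", "CCCC"),
    ("2019-03-22 08:16", "CCCC"),
    ("2019-03-22 08:17", "AAAA"),
    ("2019-03-22 08:17", "AAAA"),
]


def find_islands(rows):
    """Return (category, start, end) for each run of consecutive rows.

    Hypothetical helper that mirrors the SQL: rn1 numbers all rows by
    timestamp, rn2 numbers rows within each category, and rows whose
    (category, rn1 - rn2) match belong to the same run.
    """
    ordered = sorted(rows, key=lambda r: r[0])  # like ORDER BY Timestamp
    seen_per_category = defaultdict(int)
    islands = {}  # (category, rn1 - rn2) -> (start, end)
    for rn1, (ts, category) in enumerate(ordered, start=1):
        seen_per_category[category] += 1
        rn2 = seen_per_category[category]
        key = (category, rn1 - rn2)
        if key in islands:
            start, end = islands[key]
            islands[key] = (min(start, ts), max(end, ts))
        else:
            islands[key] = (ts, ts)
    # Like ORDER BY start_timestap
    return sorted(
        ((category, start, end) for (category, _), (start, end) in islands.items()),
        key=lambda island: island[1],
    )


for category, start, end in find_islands(rows):
    print(category, start, end)
```

On the sample data the differences come out as 0 for the first AAAA run, 3 for BBBB, 4 for CCCC and 4 for the second AAAA run. CCCC and the second AAAA run share the value 4, which is why the category has to be part of the grouping key alongside the difference.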