I have the following data:
| Timestamp        | Min   | MAX  | Category |
| ---------------- | ----- | ---- | -------- |
| 2019-03-22 08:10 | 12.00 | 5.10 | AAAA     |
| 2019-03-22 08:11 | 10.00 | 5.20 | AAAA     |
| 2019-03-22 08:12 | 11.00 | 4.90 | AAAA     |
| 2019-03-22 08:13 | 12.00 | 5.30 | BBBB     |
| 2019-03-22 08:14 | 14.00 | 5.20 | BBBB     |
| 2019-03-22 08:15 | 9.40  | 5.20 | CCCC     |
| 2019-03-22 08:16 | 7.40  | 5.10 | CCCC     |
| 2019-03-22 08:17 | 14.00 | 5.10 | AAAA     |
| 2019-03-22 08:17 | 14.00 | 5.10 | AAAA     |
I am looking for a SQL (or Linq) query that groups this data by changes in Category,
so that I can see the start and end of each consecutive run of a category over time.
Answer
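If I followed you correctly, you are dealing with a gaps-and-islands problem.
In pure SQL, this is usually solved using window functions and aggregation. Compute two row numbers, one over the whole table and one per category; their difference stays constant within each run of consecutive rows that share a category, so it gives you the group each record belongs to: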
    SELECT
        category,
        MIN(timestamp) AS start_timestap,
        MAX(timestamp) AS end_timestap
    FROM (
        SELECT
            t.*,
            ROW_NUMBER() OVER(ORDER BY Timestamp) AS rn1,
            ROW_NUMBER() OVER(PARTITION BY Category ORDER BY Timestamp) AS rn2
        FROM mytable t
    ) x
    GROUP BY category, rn1 - rn2
    ORDER BY start_timestap
In this DB Fiddle with your sample data, this returns:
| category | start_timestap      | end_timestap        |
| -------- | ------------------- | ------------------- |
| AAAA     | 2019-03-22 08:10:00 | 2019-03-22 08:12:00 |
| BBBB     | 2019-03-22 08:13:00 | 2019-03-22 08:14:00 |
| CCCC     | 2019-03-22 08:15:00 | 2019-03-22 08:16:00 |
| AAAA     | 2019-03-22 08:17:00 | 2019-03-22 08:17:00 |
NB: this is a MySQL 8.0 DB Fiddle; however, the query is standard ANSI SQL and should work on any platform that supports window functions.
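To see why the row-number difference works, it can help to trace it outside the database. The following is only a minimal Python sketch of the same idea, not part of the SQL solution: the `rows` list and the `islands` function are hypothetical names made up for illustration, and the Min/MAX columns are left out because the query never reads them.

```python
from collections import defaultdict

# Sample rows from the question as (timestamp, category) pairs.
# The Min/MAX columns are omitted: the grouping does not depend on them.
rows = [
    ("2019-03-22 08:10", "AAAA"),
    ("2019-03-22 08:11", "AAAA"),
    ("2019-03-22 08:12", "AAAA"),
    ("2019-03-22 08:13", "BBBB"),
    ("2019-03-22 08:14", "BBBB"),
    ("2019-03-22 08:15", "CCCC"),
    ("2019-03-22 08:16", "CCCC"),
    ("2019-03-22 08:17", "AAAA"),
    ("2019-03-22 08:17", "AAAA"),
]


def islands(rows):
    """Return (category, start, end) for each consecutive run of a category."""
    ordered = sorted(rows, key=lambda r: r[0])  # ORDER BY Timestamp (stable sort)
    seen_per_category = defaultdict(int)
    groups = {}
    for rn1, (ts, cat) in enumerate(ordered, start=1):  # rn1: row number overall
        seen_per_category[cat] += 1
        rn2 = seen_per_category[cat]  # rn2: row number within the category
        key = (cat, rn1 - rn2)  # constant for every row of the same run
        start, end = groups.get(key, (ts, ts))
        groups[key] = (min(start, ts), max(end, ts))
    # ORDER BY start timestamp
    return sorted(
        ((cat, start, end) for (cat, _), (start, end) in groups.items()),
        key=lambda g: g[1],
    )


for category, start, end in islands(rows):
    print(category, start, end)
```

With the sample data, `rn1 - rn2` is 0 for the first AAAA run, 3 for BBBB, 5 for CCCC and 4 for the last AAAA run, so the two AAAA runs end up in different groups. Note that when two rows share a timestamp, the order in which `ROW_NUMBER()` numbers them is not guaranteed; that is harmless here because the tied rows belong to the same category.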