SQL: Pull rows based on sequence of values

Question

I need to pull rows of data based on the existence of certain values that exist in a specific sequence. Here's an example of the data:

Header	EventId	EventDate
67891882	382	2022-01-21 09:29:50.000
67891882	81	2022-01-21 09:03:23.000
67891882	273	2022-01-21 09:03:51.000
67891882	77	2022-01-21 09:05:58.000
67891882	2	2022-01-21 09:29:48.000

The results I need are to capture the Header and the

Accepted Answer

I'm going to assume that a start event (81) always starts a new "frame" from that row onwards, and an end event (77) always starts a new frame from the following row onwards.

I'm also going to assume that you're only interested in frames where both a start and end event are present, and that the frame contains no excepted events (I'll just use 00 as random allowable events and 199 as the only excepted event).

For example...

    [81,00,81,00,77,00,81,199,77]
    => frame 0 = [81,00]
    => frame 1 = [81,00,77]
    => frame 2 = [00]
    => frame 3 = [81,199,77]

In that example only the 2nd frame's start event would be returned (the others are missing a start and/or end event, or contain the excepted event).

    WITH
      frame_start AS
    (
      SELECT
        *,
        CASE
          WHEN
            81 = eventid
          OR
            77 = LAG(eventid) OVER (PARTITION BY header ORDER BY eventdate)
          THEN
            1
          ELSE
            0
        END
          AS new_frame
      FROM
        #data
    ),
      framed AS
    (
      SELECT
        *,
        SUM(new_frame) OVER (PARTITION BY header ORDER BY eventdate) AS frame_id
      FROM
        frame_start
    )
    SELECT
      header, MIN(eventdate)
    FROM
      framed
    GROUP BY
      header, frame_id
    HAVING
      SUM(CASE WHEN eventid IN (81,77) THEN 1 ELSE 0 END) = 2
      AND
      MAX(CASE WHEN eventid IN (199) THEN 1 ELSE 0 END) = 0  -- list any other excepted event ids alongside 199

Demo : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d54493c87629e3e59759ac9d119ec6ad

Explanation:

The first CTE adds a column called new_frame.

    1 = current row is 81, or previous row is 77
    0 = everything else

This marks the start of each new frame (as described at the top here).

The next CTE assigns an id to every row in each frame, by cumulatively summing new_frame in datetime order. The id starts at 0, then is incremented on each row by that row's new_frame value (if new_frame=0, keep the same id as the previous row; if new_frame=1, increment the id by 1).

At this point the header's rows are broken down into frames (as described at the top here).

The final query groups by the frame and then filters the results with a HAVING clause. The first check is that the number of rows in the frame with 81 or 77 must total 2. The second check is that no rows in the frame can have an excepted event. If all checks pass, return the minimum timestamp in the frame, which by definition comes from the first row in the frame.
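To see the framing on the example sequence without a SQL Server instance, the same query can be run through Python's built-in sqlite3 module. This is only a sketch under some assumptions: it needs a SQLite build with window functions (3.25 or later), the temp table #data is replaced by a plain table named data, the header value 1 and the one-minute timestamps are made up for illustration, and 0 stands in for any allowed event.

```python
import sqlite3

# Hypothetical sequence from the worked example: 0 is any allowed event,
# 199 is the excepted event, 81 is a start event and 77 is an end event.
events = [81, 0, 81, 0, 77, 0, 81, 199, 77]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (header INTEGER, eventid INTEGER, eventdate TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    # Made-up timestamps, one minute apart, so the ordering is unambiguous.
    [(1, e, f"2022-01-21 09:{i:02d}:00") for i, e in enumerate(events)],
)

# The two CTEs from the answer, with #data swapped for the SQLite table.
FRAMED = """
WITH frame_start AS (
  SELECT *,
         CASE WHEN 81 = eventid
                OR 77 = LAG(eventid) OVER (PARTITION BY header ORDER BY eventdate)
              THEN 1 ELSE 0 END AS new_frame
  FROM data
),
framed AS (
  SELECT *,
         SUM(new_frame) OVER (PARTITION BY header ORDER BY eventdate) AS frame_id
  FROM frame_start
)
"""

# Intermediate result: the frame id assigned to each row, in date order.
frame_ids = [
    row[0]
    for row in conn.execute(FRAMED + "SELECT frame_id FROM framed ORDER BY eventdate")
]

# Final result: frames with both 81 and 77 and no excepted event.
matches = conn.execute(FRAMED + """
SELECT header, MIN(eventdate) AS first_eventdate
FROM framed
GROUP BY header, frame_id
HAVING SUM(CASE WHEN eventid IN (81, 77) THEN 1 ELSE 0 END) = 2
   AND MAX(CASE WHEN eventid IN (199) THEN 1 ELSE 0 END) = 0
""").fetchall()

print(frame_ids)  # [1, 1, 2, 2, 2, 3, 4, 4, 4]
print(matches)    # [(1, '2022-01-21 09:02:00')]
```

Because the first row here is itself an 81, the running sum begins at 1, so the frames labelled 0 to 3 in the example come out with ids 1 to 4. Only the frame [81,0,77] survives the HAVING clause, and its earliest timestamp is the one returned.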

Header	EventId	EventDate
62252595	81	5/23/2021 12:34:03 PM
65252595	81	5/23/2021 12:39:16 PM

Answer