I am in need of a database select query solution for transforming a series of breakdown details into a summarized version of it, possibly in a view. Consider the table below. It has a composite primary key of (PK_1
, PK_2
, PK_3
, and SEQUENCE_NO
).
PK_1 PK_2 PK_3 SEQUENCE_NO STATUS_CODE ======== ==== ==== =========== =========== 20200421 A 1 1 Y 20200421 A 1 2 Y 20200421 A 1 3 Y 20200421 A 1 4 N 20200421 A 1 5 Y 20200421 A 1 6 Y 20200421 A 2 7 Y 20200421 A 2 8 Y 20200421 B 3 9 Y 20200421 B 3 10 Y 20200421 B 3 11 Y 20200422 B 3 11 Y
Only including all records with the STATUS_CODE
of “Y”, how can I present the records in such a way consecutive records, respective to their composite primary keys, form ranges of value (indicated by SEQUENCE_FROM
and SEQUENCE_TO
, see below) while showing a gap that may indicate of a missing row, or a row with a STATUS_CODE
with a value other than “Y”?
PK_1 PK_2 PK_3 SEQUENCE_FROM SEQUENCE_TO ======== ==== ==== ============= =========== 20200421 A 1 1 3 20200421 A 1 5 6 20200421 A 2 7 8 20200421 B 3 9 11 20200422 B 3 11 11
I used MIN
and MAX
but obviously it wouldn’t accomodate showing the gap between ranges.
Advertisement
Answer
This is a gaps and islands problem. Here is one way to solve with ROW_NUMBER
, using the difference in row numbers method:
WITH cte AS ( SELECT t.*, SEQUENCE_NO - ROW_NUMBER() OVER (PARTITION BY PK_1, PK_2, PK_3 ORDER BY SEQUENCE_NO) AS diff FROM yourTable t WHERE STATUS_CODE = 'Y' ) SELECT PK_1, PK_2, PK_3, MIN(SEQUENCE_NO) AS SEQUENCE_FROM, MAX(SEQUENCE_NO) AS SEQUENCE_TO FROM cte GROUP BY PK_1, PK_2, PK_3, (rn1 - rn2) ORDER BY PK_1, PK_2, PK_3;
The crux of the logic being used here is that we form an on-the-fly group number for each island, within each PK_1
, PK_2
, PK_3
partition, by taking the difference between the sequence number and a ROW_NUMBER
. This difference is guaranteed to always be unique for each island.