I have a table like this:
id | section |
---|---|
1 | 6 |
2 | 6 |
3 | 7 |
4 | 7 |
5 | 6 |
and would like to obtain a grouped list that says
section | section_nr | first_id |
---|---|---|
6 | 1 | 1 |
7 | 2 | 3 |
6 | 3 | 5 |
Using ROW_NUMBER twice I am able to obtain something close:
SELECT section,
       ROW_NUMBER() OVER (ORDER BY id) AS section_nr,
       id AS first_id
FROM (
  SELECT id, section,
         ROW_NUMBER() OVER (PARTITION BY section ORDER BY id) AS nr_within
  FROM X
)
WHERE nr_within = 1
section | section_nr | first_id |
---|---|---|
6 | 1 | 1 |
7 | 2 | 3 |
… but of course the second section 6 is missing, since PARTITION BY groups all section=6 together. Is it somehow possible to only group until the next section?
More generally (regarding GROUP BY instead of PARTITION BY), is there a simple solution to group (1,1,2,2,1) to (1,2,1) instead of (1,2)?
Answer
This is a typical gaps and islands problem that can be solved like this:
with u as (
  select id, section,
         case when section = lag(section) over (order by id) then 0 else 1 end as grp
  from X
),
v as (
  select id, section,
         sum(grp) over (order by id) as section_nr
  from u
)
select section, section_nr, min(id) as first_id
from v
group by section, section_nr;
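The idea is to add a column that flags every change in section, by comparing the current row's section to the section of the previous row (ordered by id) using lag. Whenever the section changes, the flag is 1; otherwise it is 0. The first row has no previous row, so lag returns NULL, the comparison is not true, and the row is flagged 1, which starts the first group. The running sum of this flag is then the section number. Getting first_id is a simple matter of taking min(id) with group by section, section_nr.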
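To see the intermediate values, here is a minimal sketch that runs the same query against the sample data from the question. It assumes Python's built-in sqlite3 module with an SQLite version that supports window functions (3.25 or later); the in-memory database and the variable names steps and islands are only for illustration, and the final ORDER BY is an addition to make the output order deterministic.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real table X.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE X (id INTEGER PRIMARY KEY, section INTEGER)")
conn.executemany(
    "INSERT INTO X (id, section) VALUES (?, ?)",
    [(1, 6), (2, 6), (3, 7), (4, 7), (5, 6)],
)

# Intermediate view: the change flag (grp) and its running sum (section_nr).
steps = conn.execute("""
    with u as (
      select id, section,
             case when section = lag(section) over (order by id) then 0 else 1 end as grp
      from X
    )
    select id, section, grp, sum(grp) over (order by id) as section_nr
    from u
    order by id
""").fetchall()

# The query from the answer, plus an ORDER BY (added here) for stable output.
islands = conn.execute("""
    with u as (
      select id, section,
             case when section = lag(section) over (order by id) then 0 else 1 end as grp
      from X
    ),
    v as (
      select id, section, sum(grp) over (order by id) as section_nr
      from u
    )
    select section, section_nr, min(id) as first_id
    from v
    group by section, section_nr
    order by section_nr
""").fetchall()

for row in steps:
    print(row)    # (id, section, grp, section_nr)
# (1, 6, 1, 1)
# (2, 6, 0, 1)
# (3, 7, 1, 2)
# (4, 7, 0, 2)
# (5, 6, 1, 3)

for row in islands:
    print(row)    # (section, section_nr, first_id)
# (6, 1, 1)
# (7, 2, 3)
# (6, 3, 5)
```

The running sum only increases on rows where the flag is 1, so consecutive rows with the same section share a section_nr, while the later run of section 6 gets its own number.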