“Partition by” in OVER clause groups all of the values as unique, just like “Distinct” or “Group by” do.
This is how it works in my query with row_number():
id st t row_number ------------------- 1 1 1 1 1 1 2 2 1 1 3 3 2 1 3 1 1 2 4 1 1 1 10 4
This is what I want:
id st t uniq_row_number ------------------ 1 1 1 1 1 1 2 2 1 1 3 3 2 1 3 1 1 2 4 1 1 1 10 1
No matter if new string already was before, it’s read as new partition after every change of partition. If partition repeats, so uniq_row_number gets +1. If new partition comes with new string: boom, it gets uniq_row_number 1.
My SQL query:
SELECT id, st, t, row_number() OVER (PARTITION BY id, st ORDER BY id, st) cat_num, min(t) over (PARTITION BY id, st) min_t, max(t) over (PARTITION BY id, st) max_t FROM tabl ORDER BY t;
SQL code is here: http://sqlfiddle.com/#!18/d4290/2
Advertisement
Answer
This is called a “gaps-and-islands” problem. You need to define a group for each “island” of similar values. Then you can use row_number()
.
The difference of row numbers is a convenient way to define the islands:
select t.*, row_number() over (partition by id, seqnum_t - seqnum_it order by t ) as uniq_row_number from (select t.*, row_number() over (order by t) as seqnum_t, row_number() over (partition by id order by t) as seqnum_it, from t ) t;
The best way to understand how this works is to look at the results of the subquery. You should be able to see how the difference of row numbers defines the groups that you care about.