I have the below table.
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE 101 1/1/2019 1/31/2019 ABC123 134 HIGH_SUB 101 2/1/2019 4/30/2019 ABC123 136 HIGH_SUB 101 5/1/2019 7/31/2019 ABC123 1414 HIGH_SUB 101 8/1/2019 8/30/2019 ABC123 1469 HIGH_SUB
I need to merge dates (exclude rows) when the below occurs.
Criteria:
lead cust_sub_id <> cust_sub_id
lead cust_sub_type = cust_sub_type
lead cust_id = cust_id
lead worker = worker
Final output:
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE 101 1/1/2019 8/30/2019 ABC123 134:136:1414:1469 HIGH_SUB
SQL:
select cust_id, start_cycle, end_cycle, worker, cust_sub_id, cust_sub_type, gaps_in_srv, case when gaps_in_srv = 'N' and cust_sub_id <> lead(cust_sub_id) over (partition by cust_id order by start_cycle) and cust_id = lead(cust_id) over (partition by cust_id order by start_cycle) and worker = lead(worker) over (partition by cust_id order by start_cycle) then 'Y' else 'N' end merge_dts, dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle) rnk_fst_dt, dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle desc) rnk_lst_dt from cust_cycle
I can use listagg to capture the cust_sub_id once I remove the rows but need help removing those rows before I apply the listagg function.
Advertisement
Answer
You can approach this as a gaps-and-islands problem using lag() to determine where an “island” starts and a cumulative sum to assign a grouping number to the island:
select cust_id, worker, cust_sub_type, min(start_cycle), max(end_cycle),
listagg(cust_sub_id, ':') within group (order by start_cycle) as cust_sub_ids
from (select cc.*,
sum(case when prev_ec <> start_cycle - interval '1' day or
prev_cst <> cust_sub_type
then 1 else 0
end) over (partition by cust_id, worker order by start_cycle) as grp
from (select cc.*,
lag(end_cycle) over (partition by cust_id, worker order by start_cycle) as prev_ec,
lag(cust_sub_type) over (partition by cust_id, worker order by start_cycle) as prev_cst
from cust_cycle cc
) cc
) cc
group by cust_id, worker, cust_sub_type, worker