I saw this question in one of the interviewing practice platforms and initially assumed its a gaps and islands problem, but I couldn’t go ahead with that logic. In ETL tool called Abinitio, there is a function called key_change using which we can solve these kind of problems. But, I am unable to do it in sql. Could you please help me with this
Advertisement
Answer
Wow, I haven’t heard “Ab Initio” in a long time. In any case, this does appear to be a gaps-and-islands problem, one that can be solved using the difference of row numbers. However, it can also be solved in a similar method which uses two lags and no aggregate:
select customer_id, status, effective_date, lead(effective_date, 1, '2099-12-31') over (partition by customer_id order by effective_date) as end_date from (select t.*, lag(status) over (partition by customer_id order by effective_date) as prev_status from t ) t where prev_status is null or prev_status <> status;
Basically, this picks up every row where there is a change of status and then uses the end date from the next row (where there is a change).
Your confusion over the nature of the problem may be due to how the data is presented. The dates are not ordered which can be confusing.