I wanted to profile my data set to find the data discrepancies.
My sample date set:
id status stdate enddate 1 new 01-JUL-17 31-JUL-17 1 process 01-OCT-17 31-DEC-18 1 new 01-JAN-19 31-JAN-19--- issue 2 new 01-SEP-14 31-JAN-15 2 process 01-JUN-16 30-NOV-17 2 complete 01-DEC-17 31-DEC-18 .... ....
I would like to find out how many of those IDs have a result status that is older than current. The order of the status sequence should be NEW-PROCESS-COMPLETE. So I want report all IDs where the most recent status has reversed to an earlier status.
Advertisement
Answer
You can use the LAG()
function to find the offending rows, as in:
with x (id, status, stdate, enddate, prev_id, prev_status, prev_stdate, prev_enddate) as ( select id, status, stdate, enddate, lag(id) over(partition by id order by stdate), lag(status) over(partition by id order by stdate), lag(stdate) over(partition by id order by stdate), lag(enddate) over(partition by id order by stdate) from my_table ) select * from x where status = 'new' and prev_status in ('process', 'complete') or status = 'process' and prev_status = 'complete'
Note: I assume you need to compare only between rows of the same ID
.