Skip to content
Advertisement

Query Hierarchical Queries

I wanted to profile my data set to find the data discrepancies.

My sample date set:

id  status       stdate         enddate
1   new          01-JUL-17  31-JUL-17
1   process      01-OCT-17  31-DEC-18
1   new          01-JAN-19  31-JAN-19--- issue
2   new          01-SEP-14  31-JAN-15
2   process      01-JUN-16  30-NOV-17
2   complete     01-DEC-17  31-DEC-18
....
....

I would like to find out how many of those IDs have a result status that is older than current. The order of the status sequence should be NEW-PROCESS-COMPLETE. So I want report all IDs where the most recent status has reversed to an earlier status.

Advertisement

Answer

You can use the LAG() function to find the offending rows, as in:

with x (id, status, stdate, enddate, 
        prev_id, prev_status, prev_stdate, prev_enddate) as (
  select
    id, 
    status,
    stdate,
    enddate,
    lag(id) over(partition by id order by stdate),
    lag(status) over(partition by id order by stdate),
    lag(stdate) over(partition by id order by stdate),
    lag(enddate) over(partition by id order by stdate)
  from my_table
)
select * from x
where status = 'new' and prev_status in ('process', 'complete')
   or status = 'process' and prev_status = 'complete'

Note: I assume you need to compare only between rows of the same ID.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement