
Excluding rows in a SQL query based on values in another row while preserving multiple outputs of a single part ID

I have machine output data in the form:

DATETIME            ID       VALUE
8-28-20 20:55:10    part1    13
8-28-20 20:56:60    part1    20
8-28-20 20:57:22    part1    25
8-28-20 20:59:39    part2    9
8-28-20 21:10:55    part3    33
8-28-20 21:14:30    part1    14

and I need to generate a new table by removing some rows:

DATETIME            ID       VALUE
8-28-20 20:57:22    part1    25
8-28-20 20:59:39    part2    9
8-28-20 21:10:55    part3    33
8-28-20 21:14:30    part1    14

The machine sometimes collects multiple VALUEs for each run, but I only need the last one (it’s cumulative). However, I may have multiple runs of the same ID per shift, and it’s not impossible to have 2 consecutive runs of the same ID.

Is it possible with SQL to filter out all rows where ID of a row is equal to the ID of the row above it, only if the VALUE is greater than the VALUE of the row above it?

There are some similar questions posted here, but they all group rows and take a max value, which would capture only one run per ID for each time period.

Answer

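You seem to want the last row of each run: a row whose next row (by datetime) has a different id, or has the same id but a lower value (so a new run of the same part has started), or that has no next row at all: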

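select s.*
from (select t.*,
             -- id and value of the following row in time order (null on the last row)
             lead(id) over (order by datetime) as next_id,
             lead(value) over (order by datetime) as next_value
      from t
     ) s
-- keep a row when it is the last row overall or the next row belongs to another id
where next_id is null or next_id <> id or
      -- or the same id follows with a lower value, meaning a new run has started
      (next_id = id and next_value < value)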
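To check the query against the sample data from the question, here is a minimal sketch that loads the rows into an in-memory SQLite database and runs the same filter. Several things are assumptions of this sketch rather than part of the original post: the use of SQLite (window functions need version 3.25 or later), the helper name last_row_per_run, the column types, and the rewriting of the timestamps as ISO-style text (reading 8-28-20 as month-day-year) so that text ordering matches time order.

```python
import sqlite3

# Sample rows from the question. Timestamps are rewritten as ISO-style text
# (an assumption of this sketch) and are only ever compared as text; the
# second value is copied as written in the question, including ":60".
ROWS = [
    ("2020-08-28 20:55:10", "part1", 13),
    ("2020-08-28 20:56:60", "part1", 20),
    ("2020-08-28 20:57:22", "part1", 25),
    ("2020-08-28 20:59:39", "part2", 9),
    ("2020-08-28 21:10:55", "part3", 33),
    ("2020-08-28 21:14:30", "part1", 14),
]

# Same logic as the answer's query: look one row ahead in time order and keep
# a row when nothing follows it, a different id follows it, or the same id
# follows with a lower value (a new run has started).
QUERY = """
SELECT datetime, id, value
FROM (SELECT t.*,
             LEAD(id)    OVER (ORDER BY datetime) AS next_id,
             LEAD(value) OVER (ORDER BY datetime) AS next_value
      FROM t) AS s
WHERE next_id IS NULL
   OR next_id <> id
   OR (next_id = id AND next_value < value)
ORDER BY datetime
"""


def last_row_per_run(rows):
    """Hypothetical helper: return the final row of each run, in time order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (datetime TEXT, id TEXT, value INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in last_row_per_run(ROWS):
        print(row)
```

On the sample data this keeps the 20:57:22, 20:59:39, 21:10:55 and 21:14:30 rows, matching the desired table. Note that a row followed by the same id with an equal value is dropped, since neither condition in the filter covers that case.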
User contributions licensed under: CC BY-SA