Skip to content
Advertisement

Understanding window functions to deduplicate records while retaining true changes

I asked this in dba stack exchange but haven’t had any luck. Cross-posting.

SQLFIDDLE

I am close to figuring this out but I’m just stuck at a wall. I’m attempting to understand a post by Aaron Betrand and apply it to a situation I’ve encountered where I have a changes table that’s heavily duplicated due to prior design error I’m inheriting. The sample data set is identical in concept to my real data set, except SortOrder would usually be a datetime value and not an integer. The code I’ve tried is here:

The results are almost valid; however, the last row highlights a situation that I haven’t been able to account for: the date changes, the value doesn’t. I want the last record to be excluded because it’s not a true value change.

A colleague of mine suggested this code, and while I can follow how it arrives, I don’t understand why the first code sample doesn’t work. It feels to me like this would require a lot of extra parsing, and with a large data set I’d be worried about performance impacts.

Advertisement

Answer

If you just want to remove rows where the value doesn’t change you can apply this logic:

See fiddle

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement