
duplicate elimination with time

For my smart home setup I created a database, but I made a mistake during programming: the application posts data into the database twice. I want to delete all rows that are duplicates. By duplicate I mean a tuple whose data is identical to the previous one of the same type. I marked the duplicates in this example with “<<”; please also pay attention to the last 3 rows. I want to keep the first occurrence of each new value and delete all the duplicates that follow it. I hope you can help me solve my problem.

If I run

the output is:

That's not what I mean :/ (Sorry, I don't know how to mark this as a code block in a comment, hence this format.)


Answer

There are many possible ways to do this. The fastest is often a correlated subquery, but I can never remember the syntax, so I normally go for window functions, specifically row_number().

If you run

That should give a version of your table where all the rows you want to keep have a number 1 and the duplicates are numbered 2,3,4… Use that same field in a delete query and you’ll be sorted.
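As a rough sketch of that row_number() idea, here is a small example run through Python's built-in sqlite3 module. The table `readings` and its columns `id`, `ts`, `type` and `value` are assumptions made up for illustration, since the real schema is not shown, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema and sample data; the real table and column names will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, ts TEXT, type TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (ts, type, value) VALUES (?, ?, ?)",
    [
        ("2020-01-01 10:00", "temp", 21.5),
        ("2020-01-01 10:01", "temp", 21.5),      # duplicate of the row above
        ("2020-01-01 10:02", "humidity", 40.0),
        ("2020-01-01 10:03", "temp", 22.0),
        ("2020-01-01 10:04", "humidity", 40.0),  # duplicate of the earlier humidity row
    ],
)

# Number the rows inside each (type, value) group: 1 = keep, 2, 3, ... = duplicates.
numbered = conn.execute(
    """
    SELECT id, type, value,
           ROW_NUMBER() OVER (PARTITION BY type, value ORDER BY ts) AS rn
    FROM readings
    ORDER BY id
    """
).fetchall()

# Reuse the same numbering in a DELETE to drop everything numbered above 1.
conn.execute(
    """
    DELETE FROM readings
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY type, value ORDER BY ts) AS rn
            FROM readings
        ) AS numbered
        WHERE rn > 1
    )
    """
)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM readings ORDER BY id")]
```

Note that partitioning by both `type` and `value` removes every repeat of a value, even when a different value appeared in between.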

EDIT: I understand now that you only want to remove duplicates that occur sequentially within the same type. This can also be achieved using row_number and a join. This query should give you only the data you want.

This might need a slight tweak to avoid missing the very first entry if that’s important.
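One possible shape for the sequential version, including such a tweak, is sketched below using Python's built-in sqlite3 module. Again, the table `readings` and the columns `id`, `ts`, `type` and `value` are hypothetical stand-ins for the real schema, and SQLite 3.25 or newer is assumed for window functions. The LEFT JOIN keeps the first entry of each type, because that row has no predecessor to compare against.

```python
import sqlite3

# Hypothetical schema and sample data; adapt names to the real table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, ts TEXT, type TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (ts, type, value) VALUES (?, ?, ?)",
    [
        ("2020-01-01 10:00", "temp", 21.5),      # id 1: first of its type, keep
        ("2020-01-01 10:01", "temp", 21.5),      # id 2: same as previous temp, drop
        ("2020-01-01 10:02", "humidity", 40.0),  # id 3: first of its type, keep
        ("2020-01-01 10:03", "temp", 22.0),      # id 4: value changed, keep
        ("2020-01-01 10:04", "temp", 21.5),      # id 5: changed back, keep
        ("2020-01-01 10:05", "temp", 21.5),      # id 6: same as previous temp, drop
        ("2020-01-01 10:06", "humidity", 40.0),  # id 7: same as previous humidity, drop
    ],
)

# Number rows per type in time order, then join each row to its predecessor.
KEEP_SQL = """
WITH numbered AS (
    SELECT id, type, value,
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY ts) AS rn
    FROM readings
)
SELECT cur.id
FROM numbered AS cur
LEFT JOIN numbered AS prev
       ON prev.type = cur.type
      AND prev.rn = cur.rn - 1
WHERE prev.id IS NULL            -- first row of a type has no predecessor
   OR prev.value <> cur.value    -- value differs from the previous row of that type
ORDER BY cur.id
"""
keep_ids = [row[0] for row in conn.execute(KEEP_SQL)]

# Delete every row that the query above did not select.
all_ids = [row[0] for row in conn.execute("SELECT id FROM readings")]
conn.executemany(
    "DELETE FROM readings WHERE id = ?",
    [(i,) for i in all_ids if i not in keep_ids],
)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM readings ORDER BY id")]
```

If `value` can be NULL, the `<>` comparison would need a NULL-safe form, since comparing with NULL never evaluates to true.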

User contributions licensed under: CC BY-SA