How to keep only distinct rows in a table (PostgreSQL)

I need to delete all duplicates from a table of 40kk (40 million) rows. I have this table:

I tried this query, but after an hour of execution I gave up waiting.

Is there a more optimized way to delete the duplicates?

UPD: I need to do this just once, because I didn't handle the duplicates from the start.


Answer

Large DELETE and UPDATE operations are expensive in most database systems. If you are allowed to control transactions on this table and you do not care which id is deleted from each group of duplicates, you can follow this approach instead:

1. Create a table with unique values.

2. Switch the table names.
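
The two steps above could look roughly like the sketch below. The table and column names (`items`, `items_dedup`, `items_old`, `col_a`, `col_b`) are hypothetical, since the real table definition is not shown here; it also assumes that two rows count as duplicates when they share the same `(col_a, col_b)` values.

```sql
-- Hypothetical schema: items(id, col_a, col_b).
-- Rows with the same (col_a, col_b) are treated as duplicates (an assumption).
BEGIN;

-- Block concurrent writes so no rows are lost during the swap.
LOCK TABLE items IN ACCESS EXCLUSIVE MODE;

-- 1. Build a new table holding exactly one row per (col_a, col_b) group.
--    Without ORDER BY in the window, the surviving id is arbitrary.
CREATE TABLE items_dedup AS
SELECT id, col_a, col_b
FROM (
    SELECT id, col_a, col_b,
           ROW_NUMBER() OVER (PARTITION BY col_a, col_b) AS rn
    FROM items
) AS numbered
WHERE rn = 1;

-- 2. Swap the names; the old table is kept as a backup until verified.
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_dedup RENAME TO items;

COMMIT;
```

Note that `CREATE TABLE ... AS` copies only the data: indexes, constraints, defaults, and sequence ownership have to be recreated on the new table. Once the result has been checked, the backup can be removed with `DROP TABLE items_old;`.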

If you care about which id is deleted, you can adjust the window function that picks the surviving row, for example by adding an ORDER BY clause to it.
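
As a minimal sketch of such an adjustment, the query below keeps the row with the lowest `id` in each group. The names `items`, `col_a`, and `col_b` are hypothetical placeholders for the real table and for the columns that define a duplicate.

```sql
-- Hypothetical schema: items(id, col_a, col_b).
-- ORDER BY id makes the choice deterministic: the lowest id in each
-- (col_a, col_b) group gets rn = 1 and is the row that survives.
SELECT id, col_a, col_b
FROM (
    SELECT id, col_a, col_b,
           ROW_NUMBER() OVER (PARTITION BY col_a, col_b ORDER BY id) AS rn
    FROM items
) AS numbered
WHERE rn = 1;
```

Using `ORDER BY id DESC` instead would keep the highest id of each group.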


User contributions licensed under: CC BY-SA