How to keep only distinct rows in a table (PostgreSQL)

Question

I need to delete all duplicates from a table of 40kk (40 million) rows. I've got table: I've tried this query, but after 1 hour of execution I gave up waiting. Is there a more optimized way to delete the duplicates? UPD: I need to do it just once because I didn't handle the duplicates at the st…

Accepted Answer

Database systems tend to be slow at large DELETE and UPDATE operations. If you have permission to control transactions on this table, and you do not care which id is deleted, you can follow this approach:

1. Create a table with unique values.
2. Switch the table names.

    create table players_tmp as
    select match_id, account_id, win, id
    from (
        select match_id, account_id, win, id,
               rank() over (partition by match_id, account_id order by id) as rn
        from players
    ) r
    where rn = 1;

    alter table players rename to players_tmp2;
    alter table players_tmp rename to players;

As written, the row with the lowest id in each (match_id, account_id) group is kept. If it matters which id is deleted, edit the window function (for example, its ORDER BY).
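The create-and-rename steps can also be run as one transaction, so that other sessions never see a half-finished swap. The following is only a sketch under a few assumptions: it uses PostgreSQL's DISTINCT ON in place of the rank() subquery (it likewise keeps the lowest id per group), it assumes that blocking all access to players while the copy runs is acceptable, and the constraint name players_match_account_uniq is made up for illustration.

```sql
BEGIN;

-- Block concurrent access so no rows are written between the copy and the rename.
LOCK TABLE players IN ACCESS EXCLUSIVE MODE;

-- Keep one row per (match_id, account_id): the one with the lowest id.
CREATE TABLE players_tmp AS
SELECT DISTINCT ON (match_id, account_id)
       match_id, account_id, win, id
FROM players
ORDER BY match_id, account_id, id;

-- Swap the names; the original data stays available as players_tmp2.
ALTER TABLE players RENAME TO players_tmp2;
ALTER TABLE players_tmp RENAME TO players;

-- CREATE TABLE AS copies only the data, not indexes, constraints, or defaults.
-- This (hypothetically named) constraint stops new duplicates from being inserted.
ALTER TABLE players
    ADD CONSTRAINT players_match_account_uniq UNIQUE (match_id, account_id);

COMMIT;

-- Once the result has been checked, the old table can be dropped:
-- DROP TABLE players_tmp2;
```

Because the new table starts without any of the old table's indexes, primary key, or column defaults, those need to be recreated on it as well if the application relies on them.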

Answer