Skip to content
Advertisement

Most efficient way to delete duplicate entries from MySQL tables

We have a table (let us call it originalTbl) that has duplicate entries that we want to delete. By duplicate I mean all values other than an AUTO INCREMENT index field are the same. One way to do this is to create a new table, like the existing table (let us call it uniqueTbl), and then have a query like:

INSERT INTO uniqueTbl (non-Index-field_1, non-Index-field_2, …, non-Index-field_n)
SELECT DISTINCT non-Index-field_1, non-Index-field_2, …, non-Index-field_n FROM originalTbl;

Later on we will drop originalTbl and rename uniqueTbl to originalTbl.


However, I am looking for an alternate approach, that will delete the duplicate entries from originalTbl directly, without the overhead of first creating the uniqueTbl and then renaming it to originalTbl.

Advertisement

Answer

Unless you have very few duplicates, your method will be much, much faster. If you only have a few (say less than 1%), then you can try:

delete o
    from originalTbl o left join
         (select col1, col2, . . ., min(id) as min_id
          from originalTbl o
          group by col1, col2, . . .
         ) oo
         on oo.min_id = o.id
    where oo.min_id is null;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement