Removing near-identical values from a MySQL table

Question

Is there a way of removing near-identical values from a table in MySQL? My table has more than 10K records, one of which is a company that looks like this: on using describe tablename I get this: the company names are the same, but I would like to delete the second instance from the table, thereby keeping j…

Accepted Answer

You could try using SOUNDEX to find the "near identical" values:

    SELECT *
    FROM tablename t1
    INNER JOIN tablename t2
        ON t1.id < t2.id
        AND SOUNDEX(t1.name) = SOUNDEX(t2.name)

You will need to test it with some of your example "near identical" values to see what it does and does not work for. As suggested by Akina, you will probably need to go for some kind of normalisation process (stored function) or the Levenshtein distance function linked by Slava.
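If SOUNDEX turns out to be too coarse, the normalisation and Levenshtein ideas mentioned above can also be tried on the application side before deleting anything. The following is only a rough Python sketch under stated assumptions: the helper names (`normalise`, `levenshtein`, `near_duplicate_ids`), the `max_distance` threshold, and the sample rows are all hypothetical, and the rows are assumed to have been fetched as `(id, name)` pairs matching the columns used in the query above.

```python
import re


def normalise(name: str) -> str:
    """Lower-case the name, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def near_duplicate_ids(rows, max_distance=2):
    """Return ids whose normalised name is within max_distance of a kept row.

    Rows are visited in ascending id order, so the lowest id of each
    group of near-identical names is the one that is kept.
    """
    kept = []       # (id, normalised name) of rows we keep
    to_delete = []  # ids flagged as near duplicates
    for row_id, name in sorted(rows):
        norm = normalise(name)
        if any(levenshtein(norm, k) <= max_distance for _, k in kept):
            to_delete.append(row_id)
        else:
            kept.append((row_id, norm))
    return to_delete


# Hypothetical sample data, not taken from the question.
sample_rows = [
    (1, "Acme Corp."),
    (2, "ACME Corp"),
    (3, "Globex Ltd"),
    (4, "Acme  Corpp"),
]
duplicate_ids = near_duplicate_ids(sample_rows)
```

The returned ids could then be reviewed by hand and passed to a `DELETE ... WHERE id IN (...)` statement. Note that this compares every row against every kept row, so on a table of 10K records it may take a while, and the threshold will need tuning against your real company names.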

Answer