I have a huge table – 36 million rows – in SQLite3. In this very large table, there are two columns:
hash
– textd
– real
Some of the rows are duplicates. That is, both hash
and d
have the same values. If two hashes are identical, then so are the values of d
. However, two identical d
‘s does not imply two identical hash
‘es.
I want to delete the duplicate rows. I don’t have a primary key column.
What’s the fastest way to do this?
Advertisement
Answer
You need a way to distinguish the rows. Based on your comment, you could use the special rowid column for that.
To delete duplicates by keeping the lowest rowid
per (hash,d)
:
delete from YourTable where rowid not in ( select min(rowid) from YourTable group by hash , d )