Skip to content
Advertisement

How can I find duplicate records in clickhouse [closed]

I want to know how I can find duplicate data entries within one table in clickhouse.

I am actually investigating on a merge tree table and actually threw optimize statements at my table but that didn’t do the trick. The duplicate entries still persist.

Preferred would be to have a universal strategy without referencing individual column names.

I only want to see the duplicate entries, since I am working on very large tables.

Advertisement

Answer

The straight forward way would be to run this query.

If that query gets to big you can run it in pieces.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement