Is there a way to delete all the duplicate rows from an arbitrary table in a database when you don’t know the table’s structure (it’s columns) using only plain SQL (without using PL/SQL)?
-- The code should look something like this DELETE FROM tableName -- tableName can be changed WHERE ...
So, as an example, for the following table:
.-------. | A | B | |---|---| | 1 | 1 | -- (1) | 2 | 1 | -- (2) | 2 | 1 | -- (3) duplicate of (2) | 3 | 2 | -- (4) | 3 | 2 | -- (5) duplicate of (4) | 3 | 2 | -- (6) duplicate of (4) '-------'
the result should be as follows:
.-------. | A | B | |---|---| | 1 | 1 | -- (1) | 2 | 1 | -- (2) | 3 | 2 | -- (4) '-------'
Advertisement
Answer
SELECT distinct * from TableA INTO TableB; truncate TableA; insert into TableA select * from tableB; drop tableB;
Additional answers from the comments:
Establish a CTE (construct this dynamically using parameterized SQL and information schema)
;WITH cteExample (val1, val2, RN) as ( select *, ROW_NUMBER() over (PARTITION by val1, val2 order by val1, val2) as RN from tableA) -- select * from cteExample -- to verify results delete from cteExample where RN > 1 -- to delete duplicate records
Utilize distinct if you know the tables will have a primary key (as they should)
;WITH cteExample as (select distinct * from TableA) delete a from cteExample b right join TableA a on b.primaryKey = a.PrimaryKey where b.PrimaryKey is null;
Lastly, skip the CTE entirely if you want:
delete a from TableA a left join (select distinct * from TableA) b on a.PK = b.PK where b.PK is null
All of these statements would be best served constructed dynamically using a stored procedure or the equivalent of exec sp_executeSQL (if avail) in order to grab the column names without knowledge of the columns ahead of time.