I want to find all duplicates in a column and display the primary key so I can clean them up.
So after a bunch of tutorials I have this which is supposed to work and returns no rows which is great because it means there’s nothing I need to clean up. ‘equipment_id’ is the primary key on the table
SELECT equipment_id, tent_id, COUNT(*) FROM equipment GROUP BY equipment_id, tent_id HAVING COUNT(*) > 1
But when I just look for the dupes like this without involving the primary key column
SELECT tent_id, COUNT(*) FROM equipment GROUP BY tent_id HAVING COUNT(*) > 1
I get 900+ results, which is also fine but I need the primary key so I know which ones to edit.
TABLE equipment ROWS equipment_id tent type tent_id note 100 large 66666 out 101 small 66666 in 102 small 55555 in 103 large 44444 in 104 small 33333 out
The result I want will look like
100 66666 101 66666
Advertisement
Answer
If the first query be returning any records and the equipment_id
really be a primary key column, then it must be the case that a single tent_id
value is associated with multiple records/multiple equipment_id
values. To find this tent_id
values, try asserting the distinct count of the equipment_id
:
SELECT tent_id FROM equipment GROUP BY tent_id HAVING COUNT(equipment_id) > 1;
To generate the output you want, you could try:
WITH cte AS ( SELECT equipment_id, tent_id, COUNT(*) OVER (PARTITION BY tent_id) cnt FROM equipment ) SELECT equipment_id, tent_id FROM cte WHERE cnt > 1;