Please help me with this one, I’m stuck and cant figure out how to write my Query. I’m working with SQL Server 2014.
Table A (approx 65k ROWS) CEID = primary key
CEID State Checksum 1 2 666 2 2 666 3 2 666 4 2 333 5 2 333 6 9 333 7 9 111 8 9 111 9 9 741 10 2 656
Desired output
CEID State Checksum 3 2 666 6 9 333 8 9 111 9 9 741 10 2 656
I want to keep the row with highest CEID if “state” is equal for all duplicate checksums. If state differs but Checksum is equal i want to keep the row with highest CEID for State=9. Unique rows like CEID 9 and 10 should be included in result regardless of State.
This join returns all duplicates:
SELECT a1.*, a2.* FROM tableA a1 INNER JOIN tableA a2 ON a1.ChecksumI = a2.ChecksumI AND a1.CEID <> a2.CEID
I’ve also identified MAX(CEID)
for each duplicate checksum with this query
SELECT a.Checksum, a.State, MAX(a.CEID) CEID_MAX ,COUNT(*) cnt FROM tableA a GROUP BY a.Checksum, a.State HAVING COUNT(*) > 1 ORDER BY a.Checksum, a.State
With the first query, I can’t figure out how to SELECT
the row with the highest CEID
per Checksum.
The problem I encounter with last one is that GROUP BY isn’t allowed in subqueries when I try to join on it.
Advertisement
Answer
You can use row_number()
with partition by checksum
and order by State desc
and CEID desc
. Note both of your condition may be fulfill by ORDER BY State desc, CEID desc
And take the first row_number
;with cte as ( select *, rn = row_number() over (Partition by Checksum order by State desc, CEID desc) from TableA ) select * from cte where rn = 1 order by CEID;