Skip to content
Advertisement

SQL – Finding Duplicate Records based certain criteria

I have these records in the table – employee_projects

id employee_id project_id status
1 emp1 proj1 VERIFIED
2 emp2 proj2 REJECTED
3 emp1 proj1 VERIFIED
4 emp1 proj3 REJECTED
5 emp2 proj2 REQUIRED
6 emp3 proj4 SUBMITTED
7 emp4 proj5 VERIFIED
8 emp4 proj6 VERIFIED
9 emp3 proj4 REQUIRED

Here are the criteria for determining duplicates:

  1. Same employee ID, same project ID under the same status (Example: rows 1 and 3 are duplicates)
  2. Same employee ID, same project ID but in different status (Example: rows 6 and 9 are duplicates). An exception to duplication criteria#2 is if one project is REQUIRED and the same project is also REJECTED under the same employee, this is NOT considered a duplicate. For example, rows 2 and 5 are NOT duplicates.

I have a query for the first criterion:

What I’m struggling to construct is the SQL for the second criterion.

Advertisement

Answer

maybe a self join can help you.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement