Skip to content
Advertisement

Mark existing rows as duplicates when selecting with left join in T-SQL

I am trying to do a LEFT JOIN to merge a table and view and the right table has duplicate items(usernames). I want to keep them but add another column to mark them as isDuplicate='TRUE'.

I wanted to use CASE WHEN EXISTS but can’t get it to work correctly, it produces wrong results:

SELECT vc.id AS [id]
,vc.server AS [server]
,vc.username AS [vc_username]
,s.username AS [s_username]
,s.name AS [name]
,s.email AS [email]
,CASE WHEN EXISTS (SELECT s.username WHERE vc.username = s.username)
       THEN 'TRUE'
       ELSE 'FALSE'
          END AS [isDuplicate]
FROM v_view vc
       LEFT JOIN s_table s ON vc.username = s.username

Ideally I would like to not mark the first occurence of a row as duplicate. E.g. first time username appears it is not marked as duplicate, second time it is a duplicate. If not possible then just mark all rows that don’t have unique usernames as isDuplicate='TRUE'.

Example data:

View:

id   server   username
1    server1  user1
2    server2  user2
3    server3  user3

Table:

username    name     email
user1       Bob      bob@example.com
user1       BobAdmin bobadm@example.com
user2       John     john@example.com
user3       Daniel   daniel@example.com
user4       Adam     adam@example.com

Expected result:

id  server    vc_username  s_username   name      email                isDuplicate
1   server1   user1        user1        Bob       bob@example.com      FALSE
1   server1   user1        user1        BobAdmin  bobadm@example.com   TRUE
2   server2   user2        user2        John      john@example.com     FALSE
3   server3   user3        user3        Daniel    daniel@example.com   FALSE

Alternative option if first is not possible:

id  server    vc_username  s_username   name      email                isDuplicate
1   server1   user1        user1        Bob       bob@example.com      TRUE
1   server1   user1        user1        BobAdmin  bobadm@example.com   TRUE
2   server2   user2        user2        John      john@example.com     FALSE
3   server3   user3        user3        Daniel    daniel@example.com   FALSE

Advertisement

Answer

I think you want row_number() or count(*). To tag all duplicate user names:

SELECT . . .,
       (CASE WHEN COUNT(*) OVER (PARTITION BY username) > 1
             THEN 'TRUE'
             ELSE 'FALSE'
       END) AS [isDuplicate]
FROM v_view vc LEFT JOIN
     s_table s
     ON vc.username = s.username;

You would use row_number() . . . = 1 if you wanted to identify as duplicates all but one of the duplicates.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement