I am trying to do a LEFT JOIN
to merge a table and view and the right table has duplicate items(usernames). I want to keep them but add another column to mark them as isDuplicate='TRUE'
.
I wanted to use CASE WHEN EXISTS
but can’t get it to work correctly, it produces wrong results:
SELECT vc.id AS [id] ,vc.server AS [server] ,vc.username AS [vc_username] ,s.username AS [s_username] ,s.name AS [name] ,s.email AS [email] ,CASE WHEN EXISTS (SELECT s.username WHERE vc.username = s.username) THEN 'TRUE' ELSE 'FALSE' END AS [isDuplicate] FROM v_view vc LEFT JOIN s_table s ON vc.username = s.username
Ideally I would like to not mark the first occurence of a row as duplicate. E.g. first time username appears it is not marked as duplicate, second time it is a duplicate. If not possible then just mark all rows that don’t have unique usernames as isDuplicate='TRUE'
.
Example data:
View:
id server username 1 server1 user1 2 server2 user2 3 server3 user3
Table:
username name email user1 Bob bob@example.com user1 BobAdmin bobadm@example.com user2 John john@example.com user3 Daniel daniel@example.com user4 Adam adam@example.com
Expected result:
id server vc_username s_username name email isDuplicate 1 server1 user1 user1 Bob bob@example.com FALSE 1 server1 user1 user1 BobAdmin bobadm@example.com TRUE 2 server2 user2 user2 John john@example.com FALSE 3 server3 user3 user3 Daniel daniel@example.com FALSE
Alternative option if first is not possible:
id server vc_username s_username name email isDuplicate 1 server1 user1 user1 Bob bob@example.com TRUE 1 server1 user1 user1 BobAdmin bobadm@example.com TRUE 2 server2 user2 user2 John john@example.com FALSE 3 server3 user3 user3 Daniel daniel@example.com FALSE
Advertisement
Answer
I think you want row_number()
or count(*)
. To tag all duplicate user names:
SELECT . . ., (CASE WHEN COUNT(*) OVER (PARTITION BY username) > 1 THEN 'TRUE' ELSE 'FALSE' END) AS [isDuplicate] FROM v_view vc LEFT JOIN s_table s ON vc.username = s.username;
You would use row_number() . . . = 1
if you wanted to identify as duplicates all but one of the duplicates.