How to update several tables with a result query?

I am working with SQL Server 2017. I need to clean up duplicate user rows and update every row in the other tables that references one of the duplicated user IDs.

I have one table that contains my customers:

USERID                                | Username
C79784F1-7254-4195-AF7F-66E651F3C995  | Robert
3C51AD27-21F1-4751-9931-7C66263B4708  | Robert
0D67A3E3-E7CF-4D95-935D-E077F4A6D315  | Bob
70A9552A-028B-4EA0-A309-4E93EEAB92E8  | William
1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78  | William
411BCC56-A4C9-4D9B-9D49-FA9255ECA968  | William
F0223C57-E3B2-4F94-9820-2D9A62A515D6  | Cathy


CREATE TABLE [dbo].[Users]
(
    [UserID] [uniqueidentifier] NOT NULL,
    [UserName] [nvarchar](260) NULL 
);

INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995','Robert');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708','Robert');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315','Bob');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8','William');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78','William');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968','William');
INSERT INTO [dbo].[Users] (userid, username) 
VALUES ('F0223C57-E3B2-4F94-9820-2D9A62A515D6','Cathy');

Then I have 7 tables that contain the UserID column, and 1 table (catalog) where the same value is stored under another column name (CreatedById):

CreatedById                             | CreationDate  | Folders
C79784F1-7254-4195-AF7F-66E651F3C995    | 2018-02-24    | Folder1
3C51AD27-21F1-4751-9931-7C66263B4708    | 2019-10-12    | PAD
0D67A3E3-E7CF-4D95-935D-E077F4A6D315    | 2021-05-12    | IEF
70A9552A-028B-4EA0-A309-4E93EEAB92E8    | 2021-01-27    | WIP
1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78    | 2021-06-29    | OLD_ONE
411BCC56-A4C9-4D9B-9D49-FA9255ECA968    | 2021-01-21    | ToTest

CREATE TABLE [dbo].[catalog] 
(
    [CreatedById] [uniqueidentifier] NOT NULL,
    [CreationDate] DATE NOT NULL, 
    [Folders] [nvarchar](425)
);

INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995','2018-02-24','Folder1');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708','2019-10-12','PAD');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315','2021-05-12','IEF');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8','2021-01-27','WIP');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78','2021-06-29','OLD_ONE');
INSERT INTO [dbo].[catalog] (CreatedById, CreationDate, Folders) 
VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968','2021-01-21','ToTest');

My other tables:

CREATE TABLE table3 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table4 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table5 ([USERID] [uniqueidentifier] NOT NULL);
CREATE TABLE table6 ([USERID] [uniqueidentifier] NOT NULL);


INSERT INTO table3 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table3 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table3 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table3 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table3 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table3 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');

INSERT INTO table4 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table4 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table4 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table4 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table4 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table4 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');

INSERT INTO table5 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table5 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table5 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table5 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table5 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table5 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');

INSERT INTO table6 (USERID) VALUES ('C79784F1-7254-4195-AF7F-66E651F3C995');
INSERT INTO table6 (USERID) VALUES ('3C51AD27-21F1-4751-9931-7C66263B4708');
INSERT INTO table6 (USERID) VALUES ('0D67A3E3-E7CF-4D95-935D-E077F4A6D315');
INSERT INTO table6 (USERID) VALUES ('70A9552A-028B-4EA0-A309-4E93EEAB92E8');
INSERT INTO table6 (USERID) VALUES ('1D8E9F5D-FEEB-43DA-9CDA-F22D610CDE78');
INSERT INTO table6 (USERID) VALUES ('411BCC56-A4C9-4D9B-9D49-FA9255ECA968');

I want to remove the duplicates and keep only one record per user name in the database.

First, I created a query that returns, for each duplicated user name, the single record to keep.

With this record, I'll update table3, table4, table5 and table6:

WITH singleUser AS
(
    SELECT 
        a.UserName,
        a.UserID
    FROM
        (SELECT
             userid,
             Username,
             ROW_NUMBER() OVER (PARTITION BY username ORDER BY username ASC) AS rowNo,
             COUNT(*) OVER (PARTITION BY username) AS c
         FROM
             dbo.users
         WHERE
             1 = 1 
         GROUP BY
             userid, Username) a
    WHERE 
        rowNo > 1
        AND c = rowNo
)
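Because every row in a partition has the same username, ORDER BY username leaves the numbering inside each partition arbitrary, so the surviving UserID can differ between runs. A minimal sketch of a deterministic variant follows, assuming it is acceptable to keep whichever UserID sorts first; the names ranked, OldUserID and KeepUserID are illustrative and not part of the original query, and the survivor it picks may differ from the one shown in the merge result of this post.

```sql
-- Preview the old -> kept mapping before changing any data.
WITH ranked AS
(
    SELECT
        UserID,
        UserName,
        -- Ordering by UserID makes the choice repeatable across runs.
        FIRST_VALUE(UserID) OVER (PARTITION BY UserName ORDER BY UserID) AS KeepUserID,
        COUNT(*)            OVER (PARTITION BY UserName)                 AS c
    FROM dbo.Users
)
SELECT
    UserName,
    UserID     AS OldUserID,
    KeepUserID
FROM ranked
WHERE c > 1                   -- only names that occur more than once
  AND UserID <> KeepUserID;   -- only the rows that would be replaced
```

Reviewing this mapping first makes it easier to confirm that two rows with the same name really are the same person before merging them.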

Then I created a query that gives me all the tables that contain my ‘Userid’ column.

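With the sample tables above, this query returns table3, table4, table5 and table6, and (under a case-insensitive collation) also catalog through CreatedById and Users itself.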

WITH tableToUpdate AS
(
    SELECT  
        TABLE_CATALOG   AS 'Bdd',
        TABLE_SCHEMA    AS 'Schema',
        TABLE_NAME      AS 'TableName',
        COLUMN_NAME     AS 'ColumnName'
    FROM 
        INFORMATION_SCHEMA.COLUMNS
    WHERE 
        COLUMN_NAME IN ('CreatedByID', 'UserID')
)
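INFORMATION_SCHEMA.COLUMNS also lists the columns of views, and a column can share the name without holding a user ID. A hedged sketch of a stricter lookup, assuming only base tables with uniqueidentifier columns should be touched (the extra filters are my assumption, not part of the original query):

```sql
SELECT
    c.TABLE_SCHEMA,
    c.TABLE_NAME,
    c.COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS c
JOIN INFORMATION_SCHEMA.TABLES  AS t
    ON  t.TABLE_SCHEMA = c.TABLE_SCHEMA
    AND t.TABLE_NAME   = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'             -- skip views
  AND c.DATA_TYPE  = 'uniqueidentifier'       -- skip same-named columns of another type
  AND c.COLUMN_NAME IN ('CreatedByID', 'UserID')
ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME;
```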

And finally I created my merge query:

MERGE INTO dbo.catalog c
USING (SELECT
           u.UserID AS UserIDUsers,
           su.UserID AS UserIDSingleUser
       FROM 
           dbo.Users u
       JOIN 
           singleUser su ON su.Username = u.username
       WHERE 
           1 = 1) S ON c.CreatedByID = s.UserIDUsers

WHEN MATCHED THEN
    UPDATE
        SET c.CreatedByID = S.UserIDSingleUser;

My merge result:

CreatedById                             | CreationDate  | Folders
C79784F1-7254-4195-AF7F-66E651F3C995    | 2018-02-24    | Folder1
C79784F1-7254-4195-AF7F-66E651F3C995    | 2019-10-12    | PAD
0D67A3E3-E7CF-4D95-935D-E077F4A6D315    | 2021-05-12    | IEF
70A9552A-028B-4EA0-A309-4E93EEAB92E8    | 2021-01-27    | WIP
70A9552A-028B-4EA0-A309-4E93EEAB92E8    | 2021-06-29    | OLD_ONE
70A9552A-028B-4EA0-A309-4E93EEAB92E8    | 2021-01-21    | ToTest

It works very well, but is there a way to automate it? Currently I have 8 queries, and only the MERGE section changes between them. Also, how can I remove the duplicate rows from my dbo.users table after all the referencing columns have been updated?

Thank you for your help.

Answer

I came back to answer my own question. After a few days, I finally got it working.

Beforehand, I created a table (dbo.singleUser) from the result of my CTE query (singleUser).
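The statement that built dbo.singleUser is not shown here. One possible way to materialize it, as a sketch only: SELECT ... INTO creates the table from the CTE result. The ORDER BY UserID and the ranked CTE name are my assumptions, chosen to make the surviving row repeatable.

```sql
DROP TABLE IF EXISTS dbo.singleUser;   -- available from SQL Server 2016

WITH ranked AS
(
    SELECT
        UserID,
        UserName,
        ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY UserID) AS rowNo,
        COUNT(*)     OVER (PARTITION BY UserName)                 AS c
    FROM dbo.Users
)
SELECT
    UserName,
    UserID              -- the one UserID to keep for this name
INTO dbo.singleUser
FROM ranked
WHERE rowNo > 1         -- the name occurs more than once
  AND c = rowNo;        -- keep the last row of each partition
```

Keeping the mapping in a real table means dynamic SQL can read it, which a CTE defined in another batch could not offer.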

CREATE OR ALTER PROCEDURE dbo.mergeUserID
AS
BEGIN
    SET NOCOUNT ON;

    -- sysname is the type SQL Server uses for identifiers (nvarchar(128)).
    DECLARE @schemaName sysname;
    DECLARE @tableName  sysname;
    DECLARE @columnName sysname;
    DECLARE @sql        nvarchar(max);

    -- One row per column that stores a user id.
    DECLARE cursor_db CURSOR LOCAL FAST_FORWARD FOR
        SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE COLUMN_NAME IN ('CreatedByID', 'ModifiedByID', 'OwnerID', 'UserID')
            -- Skip the mapping tables: rewriting dbo.Users would lose the old-to-kept id mapping.
            AND TABLE_NAME NOT IN ('Users', 'singleUser');

    OPEN cursor_db;
    FETCH NEXT FROM cursor_db INTO @schemaName, @tableName, @columnName;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- QUOTENAME protects against reserved or unusual identifiers.
        SET @sql = N'MERGE INTO ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N' AS t
        USING (
            SELECT
                u.UserID  AS UserIDUsers,
                su.UserID AS UserIDSingleUser
            FROM dbo.Users u
                JOIN dbo.singleUser su ON su.UserName = u.UserName
        ) AS s ON t.' + QUOTENAME(@columnName) + N' = s.UserIDUsers
        WHEN MATCHED THEN
            UPDATE SET t.' + QUOTENAME(@columnName) + N' = s.UserIDSingleUser;';

        PRINT @sql;
        EXEC sys.sp_executesql @sql;

        FETCH NEXT FROM cursor_db INTO @schemaName, @tableName, @columnName;
    END;

    CLOSE cursor_db;
    DEALLOCATE cursor_db;
END;
GO
------------------------------------------------
DECLARE @RC int;  -- a stored procedure returns an integer status code

EXECUTE @RC = [dbo].[mergeUserID];
PRINT @RC;
GO
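The question also asked how to remove the duplicate rows from dbo.Users once every referencing table has been updated. A minimal sketch, assuming dbo.singleUser holds one row per duplicated name with the UserID to keep, and that no foreign key still points at the rows being deleted:

```sql
BEGIN TRANSACTION;

-- Delete every user row that shares a name with a kept row but is not that row.
DELETE u
FROM dbo.Users      AS u
JOIN dbo.singleUser AS su
    ON su.UserName = u.UserName
WHERE u.UserID <> su.UserID;

-- Check the outcome before making it permanent: this should return no rows.
SELECT UserName, COUNT(*) AS cnt
FROM dbo.Users
GROUP BY UserName
HAVING COUNT(*) > 1;

COMMIT TRANSACTION;   -- or ROLLBACK TRANSACTION if the check shows something unexpected
```

Running the delete inside a transaction leaves room to roll back if the verification query still reports duplicates.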

I don't know whether it's well coded, because this is the first time I've done this. For example, on some forums I've seen people put a semicolon after FETCH / CLOSE / DEALLOCATE, while others don't.

with semicolon

Microsoft without semicolon

So who is right? I don't know.
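As far as I know, SQL Server accepts both forms for cursor statements: the semicolon terminator is optional for most statements, but it is required at the end of a MERGE and on the statement that precedes a WITH (CTE) clause, and Microsoft's deprecated-features list includes leaving statements unterminated. A small sketch with every statement terminated (demo_cursor and @name are illustrative names):

```sql
DECLARE @name sysname;

DECLARE demo_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.tables;

OPEN demo_cursor;
FETCH NEXT FROM demo_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;                              -- one table name per iteration
    FETCH NEXT FROM demo_cursor INTO @name;
END;

CLOSE demo_cursor;
DEALLOCATE demo_cursor;
```

Terminating every statement is the safer habit, since it avoids having to remember which statements require it.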

User contributions licensed under: CC BY-SA