Skip to content
Advertisement

Using SQL Merge or UPDATE / INSERT

I have a table (Customer_Master_File) that needs to updated from flat files dumped into a folder. I have an SSIS package that runs to pick up the flat files and imports them into a temp table (temp_Customer_Master_File)

What I have been unable to do is this:

for each record in the temp table, if the Customer_Number exists in the Master table, update it, if not insert the contents of the temp table.

I’m updating all fields of the record, not looking for individual field changes.

I tried the SQL Merge function but it errors when there is more than one record in the source data.

The flat files contain changes to the customer record, and there could be more than one change at a time. I just want to process each record with inserting or updating as necessary.

I also tried doing an INSERT INTO MASTER_FILE FROM TEMP_TABLE WHERE CUSTOMER_NUMBER NOT IN MASTER_FILE but this also fails with a PK error when it hits a duplicate source row.

Advertisement

Answer

UPDATE m SET 
  col2 = t.col2, 
  col3 = t.col3 -- etc. - all columns except Customer_Number
FROM dbo.Master_File AS m
INNER JOIN 
(
  SELECT 
    Customer_Number, rn = ROW_NUMBER() OVER
    (
      PARTITION BY Customer_Number ORDER BY [timestamp_column] DESC
    ), col2, col3, ... etc ...
  FROM dbo.Temp_Table
) AS t
ON m.Customer_Number = t.Customer_Number
WHERE t.rn = 1;

INSERT dbo.Master_File(Customer_Number, col2, col3, ...etc...)
  SELECT Customer_Number, col2, col3, ...etc...
  FROM 
  (
    SELECT 
      Customer_Number, rn = ROW_NUMBER() OVER 
      (
        PARTITION BY Customer_Number ORDER BY [timestamp_column DESC
      ),
      col2, col3, ...etc...
    FROM dbo.Temp_Table AS t 
    WHERE NOT EXISTS 
    (
      SELECT 1 FROM dbo.Master_File AS m
      WHERE m.Customer_Number = t.Customer_Number
    )
  ) AS x WHERE rn = 1;

This takes care of multiple rows in the source table that don’t already exist in the destination. I’ve made an assumption about column names which you’ll have to adjust.

MERGE may be tempting, however there are a few reasons I shy away from it:

  1. the syntax is daunting and hard to memorize…
  2. you don’t get any more concurrency than the above approach unless you intentionally add specific locking hints
  3. there are many unresolved bugs with MERGE and probably many more that have yet to be uncovered…

I recently published a cautionary tip here as well and have collected some other opinions here.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement