I have a table with the below structure.
ID Entity UserName UserRole UserStatus UpdatedDate
Clustered key is defined on columns Entity, UserName, UserRole
.
UserStatus
for an Entity-UserName-UserRole
could be either Active
or Inactive
.
I could implement the INSERT
and UPDATE
statements for a dataframe
df as follows:
Insert:
values <- paste("('",df$Entity,"','", df$UserName,"','", ,df$UserRole,"','" ,df$UserStatus,"','" , df$UpdatedDate"')", sep="", collapse=",") sqlStr_Insert<- sqlQuery(con, sprintf(paste("Insert into TBL (Entity, UserName,UserRole,UserStatus,UpdatedDate) values ",values)))
Update:
sql_string <- "UPDATE TBL SET UserStatus = '%s', UpdatedDate = '%s' WHERE Entity='%s' AND UserName ='%s' AND UserRole = '%s';" sqlStr_update <-sqlQuery(con, paste(sprintf(sql_string, UserStatus, UpdatedDate,df$Entity,df$UserName,df$UserRole), collapse=""))
The code works fine but I would like it to check if for every row in the df for a combination of Entity-UserName-UserRole already exists in the TBL. If found, run the UPDATE statement else run the INSERT statement.
For example – SQL Server table data:
ID Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- 1 A Ford Analyst Active 1/1/2020 2 B Acura Analyst Active 1/5/2020 3 A Tesla Engineer Inactive 1/6/2020 4 A Kia Analyst Active 1/1/2020
df to be inserted or updated:
Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- A Accord Analyst Active 1/10/2020 B Acura Analyst Active 1/10/2020 C BMW Engineer Active 1/10/2020
What should happen:
Only rows 1st and 3rd should get ‘INSERTED
‘ to the database TBL. UPDATE
should only happen if the ‘UserStatus
‘ changes for an existing Entity-UserName-UserRole
Combination.
I guess I could run a ‘FOR’ loop for every row of the dataframe df but I think that would be slow and not an optimum solution.
Advertisement
Answer
You could create a temporary table (use #
prefix in SQL Server
) and send a MERGE
query as suggested by @CharlieFace:
data.new <- read.table(text=' Entity UserName UserRole UserStatus UpdatedDate A Accord Analyst Active 1/10/2020 B Acura Analyst Active 1/10/2020 C BMW Engineer Active 1/10/2020', header = T) library(DBI) conn <- dbConnect(...) # Create temporary table dbWriteTable(conn,'#NEWDATA',data.new) dbExecute(conn," MERGE INTO TBL WITH (HOLDLOCK) AS target USING #NEWDATA AS source ON target.Entity = source.Entity AND target.UserName = source.UserName AND target.UserRole = source.UserRole WHEN MATCHED THEN UPDATE SET target.UserStatus = source.UserStatus WHEN NOT MATCHED BY TARGET THEN INSERT (Entity,UserName,UserRole,UserStatus,UpdatedDate) VALUES (Entity,UserName,UserRole,UserStatus,UpdatedDate); " ) # Remove temporary table dbExecute(conn,"DROP TABLE #NEWDATA" ) # Check results result <- dbGetQuery(conn,"SELECT * from TBL") result ID Entity UserName UserRole UserStatus UpdatedDate 1 1 A Ford Analyst Active 1/1/2020 2 2 B Acura Analyst Active 1/5/2020 3 3 A Tesla Engineer Inactive 1/6/2020 4 4 A Kia Analyst Active 1/1/2020 5 NA A Accord Analyst Active 1/10/2020 6 NA C BMW Engineer Active 1/10/2020
Not sure how ID
is managed on the server : autonumber?