I have a table with the below structure.
ID Entity UserName UserRole UserStatus UpdatedDate
Clustered key is defined on columns Entity, UserName, UserRole.
UserStatus for an Entity-UserName-UserRole could be either Active or Inactive.
I could implement the INSERT and UPDATE statements for a dataframe df as follows:
Insert:
values <- paste("('",df$Entity,"','", df$UserName,"','",
,df$UserRole,"','" ,df$UserStatus,"','" ,
df$UpdatedDate"')", sep="", collapse=",")
sqlStr_Insert<- sqlQuery(con, sprintf(paste("Insert into TBL
(Entity, UserName,UserRole,UserStatus,UpdatedDate) values ",values)))
Update:
sql_string <- "UPDATE TBL SET UserStatus = '%s', UpdatedDate = '%s' WHERE Entity='%s' AND UserName ='%s' AND UserRole = '%s';" sqlStr_update <-sqlQuery(con, paste(sprintf(sql_string, UserStatus, UpdatedDate,df$Entity,df$UserName,df$UserRole), collapse=""))
The code works fine but I would like it to check if for every row in the df for a combination of Entity-UserName-UserRole already exists in the TBL. If found, run the UPDATE statement else run the INSERT statement.
For example – SQL Server table data:
ID Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- 1 A Ford Analyst Active 1/1/2020 2 B Acura Analyst Active 1/5/2020 3 A Tesla Engineer Inactive 1/6/2020 4 A Kia Analyst Active 1/1/2020
df to be inserted or updated:
Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- A Accord Analyst Active 1/10/2020 B Acura Analyst Active 1/10/2020 C BMW Engineer Active 1/10/2020
What should happen:
Only rows 1st and 3rd should get ‘INSERTED‘ to the database TBL. UPDATE should only happen if the ‘UserStatus‘ changes for an existing Entity-UserName-UserRole Combination.
I guess I could run a ‘FOR’ loop for every row of the dataframe df but I think that would be slow and not an optimum solution.
Advertisement
Answer
You could create a temporary table (use # prefix in SQL Server) and send a MERGE query as suggested by @CharlieFace:
data.new <- read.table(text='
Entity UserName UserRole UserStatus UpdatedDate
A Accord Analyst Active 1/10/2020
B Acura Analyst Active 1/10/2020
C BMW Engineer Active 1/10/2020', header = T)
library(DBI)
conn <- dbConnect(...)
# Create temporary table
dbWriteTable(conn,'#NEWDATA',data.new)
dbExecute(conn,"
MERGE INTO TBL WITH (HOLDLOCK) AS target
USING #NEWDATA AS source
ON target.Entity = source.Entity
AND target.UserName = source.UserName
AND target.UserRole = source.UserRole
WHEN MATCHED THEN
UPDATE SET target.UserStatus = source.UserStatus
WHEN NOT MATCHED BY TARGET THEN
INSERT (Entity,UserName,UserRole,UserStatus,UpdatedDate)
VALUES (Entity,UserName,UserRole,UserStatus,UpdatedDate);
" )
# Remove temporary table
dbExecute(conn,"DROP TABLE #NEWDATA" )
# Check results
result <- dbGetQuery(conn,"SELECT * from TBL")
result
ID Entity UserName UserRole UserStatus UpdatedDate
1 1 A Ford Analyst Active 1/1/2020
2 2 B Acura Analyst Active 1/5/2020
3 3 A Tesla Engineer Inactive 1/6/2020
4 4 A Kia Analyst Active 1/1/2020
5 NA A Accord Analyst Active 1/10/2020
6 NA C BMW Engineer Active 1/10/2020
Not sure how ID is managed on the server : autonumber?