I have a table with the below structure.
ID Entity UserName UserRole UserStatus UpdatedDate
Clustered key is defined on columns Entity, UserName, UserRole
.
UserStatus
for an Entity-UserName-UserRole
could be either Active
or Inactive
.
I could implement the INSERT
and UPDATE
statements for a dataframe
df as follows:
Insert:
values <- paste("('",df$Entity,"','", df$UserName,"','",
,df$UserRole,"','" ,df$UserStatus,"','" ,
df$UpdatedDate"')", sep="", collapse=",")
sqlStr_Insert<- sqlQuery(con, sprintf(paste("Insert into TBL
(Entity, UserName,UserRole,UserStatus,UpdatedDate) values ",values)))
Update:
sql_string <- "UPDATE TBL SET UserStatus = '%s', UpdatedDate = '%s' WHERE Entity='%s' AND UserName ='%s' AND UserRole = '%s';"
sqlStr_update <-sqlQuery(con, paste(sprintf(sql_string, UserStatus, UpdatedDate,df$Entity,df$UserName,df$UserRole), collapse=""))
The code works fine but I would like it to check if for every row in the df for a combination of Entity-UserName-UserRole already exists in the TBL. If found, run the UPDATE statement else run the INSERT statement.
For example – SQL Server table data:
ID Entity UserName UserRole UserStatus UpdatedDate
---------------------------------------------------
1 A Ford Analyst Active 1/1/2020
2 B Acura Analyst Active 1/5/2020
3 A Tesla Engineer Inactive 1/6/2020
4 A Kia Analyst Active 1/1/2020
df to be inserted or updated:
Entity UserName UserRole UserStatus UpdatedDate
---------------------------------------------------
A Accord Analyst Active 1/10/2020
B Acura Analyst Active 1/10/2020
C BMW Engineer Active 1/10/2020
What should happen:
Only rows 1st and 3rd should get ‘INSERTED
‘ to the database TBL. UPDATE
should only happen if the ‘UserStatus
‘ changes for an existing Entity-UserName-UserRole
Combination.
I guess I could run a ‘FOR’ loop for every row of the dataframe df but I think that would be slow and not an optimum solution.
Advertisement
Answer
You could create a temporary table (use #
prefix in SQL Server
) and send a MERGE
query as suggested by @CharlieFace:
data.new <- read.table(text='
Entity UserName UserRole UserStatus UpdatedDate
A Accord Analyst Active 1/10/2020
B Acura Analyst Active 1/10/2020
C BMW Engineer Active 1/10/2020', header = T)
library(DBI)
conn <- dbConnect( )
# Create temporary table
dbWriteTable(conn,'#NEWDATA',data.new)
dbExecute(conn,"
MERGE INTO TBL WITH (HOLDLOCK) AS target
USING #NEWDATA AS source
ON target.Entity = source.Entity
AND target.UserName = source.UserName
AND target.UserRole = source.UserRole
WHEN MATCHED THEN
UPDATE SET target.UserStatus = source.UserStatus
WHEN NOT MATCHED BY TARGET THEN
INSERT (Entity,UserName,UserRole,UserStatus,UpdatedDate)
VALUES (Entity,UserName,UserRole,UserStatus,UpdatedDate);
" )
# Remove temporary table
dbExecute(conn,"DROP TABLE #NEWDATA" )
# Check results
result <- dbGetQuery(conn,"SELECT * from TBL")
result
ID Entity UserName UserRole UserStatus UpdatedDate
1 1 A Ford Analyst Active 1/1/2020
2 2 B Acura Analyst Active 1/5/2020
3 3 A Tesla Engineer Inactive 1/6/2020
4 4 A Kia Analyst Active 1/1/2020
5 NA A Accord Analyst Active 1/10/2020
6 NA C BMW Engineer Active 1/10/2020
Not sure how ID
is managed on the server : autonumber?