
Creating a VIEW to remove duplicates from table based on max date

I have a table to which data is appended each day, along with the imported date; however, this creates duplicates.

My end goal is to remove the duplicates, keeping the row with the lowest Imported date for each client.

Here is the initial state of that table:

TABLE CLIENTS

Name   Surname  Imported
-----  -------  ----------
Bob    John     18-07-2022
Marta  White    18-07-2022
Ryan   Max      18-07-2022
Bob    John     20-07-2022
Marta  White    20-07-2022
Ryan   Max      20-07-2022
Brian  Red      20-07-2022

Desired state:

Name   Surname  Imported
-----  -------  ----------
Bob    John     18-07-2022
Marta  White    18-07-2022
Ryan   Max      18-07-2022
Brian  Red      20-07-2022
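
For reference, here is a minimal script to recreate the sample data. This is only a sketch; the DATE column type and the ISO input format are assumptions on my part.

CREATE TABLE CLIENTS (
    name     VARCHAR,
    surname  VARCHAR,
    imported DATE
);

INSERT INTO CLIENTS (name, surname, imported) VALUES
    ('Bob',   'John',  '2022-07-18'),
    ('Marta', 'White', '2022-07-18'),
    ('Ryan',  'Max',   '2022-07-18'),
    ('Bob',   'John',  '2022-07-20'),
    ('Marta', 'White', '2022-07-20'),
    ('Ryan',  'Max',   '2022-07-20'),
    ('Brian', 'Red',   '2022-07-20');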

I figured I could create a view, match it against all rows of the original table, and remove the rows with the MAX(Imported) date? Although I'm not sure if this is the correct approach.

Here is the view with the MAX date:

CREATE VIEW CLIENTS_VIEW AS
SELECT * FROM CLIENTS
WHERE "Imported" = ( SELECT MAX("Imported") FROM CLIENTS);

Although I'm not sure how I could now match this view (which returns the latest import's rows) against the original table and keep only the lowest Imported value per client?


Answer

As mentioned in my comment, ideally each client would have a unique identifier. Lacking that, I am going to use name||'_'||surname as a pseudo primary key.

There are a couple of approaches you could use here.

The first is using a subquery to join on the key and the imported date:

CREATE VIEW CLIENTS_VIEW AS
SELECT C.*
FROM CLIENTS C
JOIN (
    SELECT
        name || '_' || surname AS client_name,
        MAX(imported) AS latest
    FROM CLIENTS
    GROUP BY 1
) MI
    ON MI.client_name = C.name || '_' || C.surname
    AND MI.latest = C.imported;

Another would be to use a row number function, as per the other answer:

CREATE VIEW CLIENTS_VIEW AS
SELECT C.*
FROM CLIENTS C
QUALIFY ROW_NUMBER() OVER (PARTITION BY name, surname ORDER BY imported DESC) = 1;

In my experience, the subquery approach is more performant when the amount of data is large.

There are other alternatives, for example using NOT EXISTS, joining back onto the same table, or using a CTE.
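
For example, the NOT EXISTS variant might look like the following sketch: it keeps a row only when no later import exists for the same client.

CREATE VIEW CLIENTS_VIEW AS
SELECT C.*
FROM CLIENTS C
WHERE NOT EXISTS (
    -- a later import for the same client means this row is not the latest
    SELECT 1
    FROM CLIENTS C2
    WHERE C2.name = C.name
      AND C2.surname = C.surname
      AND C2.imported > C.imported
);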

The most performant option for larger tables would be to create another table with the latest data for each client (again, a unique identifier would be needed) and periodically use MERGE to upsert new data.

Something like this:

MERGE INTO clients_latest cl
USING (SELECT * FROM clients) c
    ON cl.name || '_' || cl.surname = c.name || '_' || c.surname
WHEN MATCHED THEN
    UPDATE SET imported = c.imported
WHEN NOT MATCHED THEN
    INSERT (name, surname, imported) VALUES (c.name, c.surname, c.imported);

If this data changes infrequently, then a semi-regular scheduled task could run this for you. If the table is constantly being appended to, then an append-only table stream might be a quicker option, as you would then only be upserting the data added since the last upsert.
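
As a rough sketch of that setup (the stream name here is made up):

-- Capture only rows appended to CLIENTS since the stream was last consumed
CREATE STREAM clients_stream ON TABLE clients APPEND_ONLY = TRUE;

-- Consuming the stream in a DML statement advances its offset,
-- so each scheduled run upserts only the new rows
MERGE INTO clients_latest cl
USING (SELECT name, surname, imported FROM clients_stream) c
    ON cl.name || '_' || cl.surname = c.name || '_' || c.surname
WHEN MATCHED THEN
    UPDATE SET imported = c.imported
WHEN NOT MATCHED THEN
    INSERT (name, surname, imported) VALUES (c.name, c.surname, c.imported);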
