Skip to content
Advertisement

Data Architecture Question – Storing Daily Snapshot of Employee Master Data

I’m trying to store daily snapshot of Employee data for FTE analysis project – analytics on how many FTE in various positions any given day.

I can call a REST API, which will give me data for all the active and terminated employees as of API call time. Is it prudent to call this API every single day and store daily snapshot or store only records which changed from previous version. What is the common design principle for this use case. Thanks!

Advertisement

Answer

I ended up using Delta tables on Databricks.

Sample code to create the table and load data

Now you can query data of any given day using below query

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement