Skip to content
Advertisement

Reducing granularity of historical status table?

I have a table with what are essentially historical logs for project tasks.

enter image description here

Each row contains the ID for a project and the date of when one specific task was either started or ended.

I need to reduce the grain so that all of the start/end times are in a single row (see IDs 1 and 2 in the image)

But here’s the part that’s kicking me, some rows may have duplicate task entries, but with different timestamps – meaning the user had to go back and redo a task. For these, I only need the latest timestamp for that specific column (see ID 3 in the image).

There is another table that contains the IDs for the projects that is the main table for this. I have tried using a left join, but I seem to have trouble getting just the latest timestamp.

The way I’ve been going about it so far is like this:

Obviously this doesn’t work as intended because I need to weed out any older tasks that were redone – which is where I’m stuck.

Any tips or advice are appreciated.

NOTE: Neither table can be altered – I can only create a new fact table from the data given.


DISCLAIMER: It’s likely that a similar question has been asked here before, but I couldn’t think of what specifically to ask that actually came up with any helpful results. If someone links to a similar post, I’ll edit this message to include that link as a reference for future users.

Advertisement

Answer

You could use aggregation on all those 6 columns. You can change column names and aliases as desired

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement