Skip to content
Advertisement

Reducing granularity of historical status table?

I have a table with what are essentially historical logs for project tasks.

enter image description here

Each row contains the ID for a project and the date of when one specific task was either started or ended.

I need to reduce the grain so that all of the start/end times are in a single row (see IDs 1 and 2 in the image)

But here’s the part that’s kicking me, some rows may have duplicate task entries, but with different timestamps – meaning the user had to go back and redo a task. For these, I only need the latest timestamp for that specific column (see ID 3 in the image).

There is another table that contains the IDs for the projects that is the main table for this. I have tried using a left join, but I seem to have trouble getting just the latest timestamp.

The way I’ve been going about it so far is like this:

SELECT ID, a1.A_START, a2.A_END [...] c2.C_END FROM PROJECT
LEFT JOIN (SELECT ID, A_START FROM PROJECT_TASK ORDER BY A_START DESC) a1 ON a1.ID = p.ID
LEFT JOIN (SELECT ID, A_END FROM PROJECT_TASK ORDER BY A_END DESC)     a2 ON a2.ID = p.ID
LEFT JOIN (SELECT ID, B_START FROM PROJECT_TASK ORDER BY B_START DESC) b1 ON b1.ID = p.ID
...
LEFT JOIN (SELECT ID, C_END FROM PROJECT_TASK ORDER BY C_END DESC)     c2 ON c2.ID = p.ID

Obviously this doesn’t work as intended because I need to weed out any older tasks that were redone – which is where I’m stuck.

Any tips or advice are appreciated.

NOTE: Neither table can be altered – I can only create a new fact table from the data given.


DISCLAIMER: It’s likely that a similar question has been asked here before, but I couldn’t think of what specifically to ask that actually came up with any helpful results. If someone links to a similar post, I’ll edit this message to include that link as a reference for future users.

Advertisement

Answer

You could use aggregation on all those 6 columns. You can change column names and aliases as desired

select id, max(col1), max(col2), max(col3), max(col4), max(col5), max(col6)
from your_table
group by id;
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement