Skip to content
Advertisement

How to carry over latest observed record when grouping by on SQL?

(I’ve created a similar question before, but I messed it up beyond repair. Hopefully, I can express myself better this time.)

I have a table containing records that change through time, each row representing a modification in Stage and Amount. I need to group these records by Day and Stage, summing up the Amount.

The tricky part is: ids might not change in some days. Since there won’t be any record in those days, so I need to carry over the latest record observed.

Find below the records table and the expected result. MRE on dbfiddle (PostgreSQL)

Records

Records

Expected Result

Expected Result

I created this basic visualization to demonstrate how the Amounts and Stages change throughout the days. Each number/color change represents a modification.

Visualization

The logic behind the expected result can be found below.

Total Amount by Stage on Day 2

Id A was modified on Day 2, let’s take that Amount: Negotiation 60.

Id B wasn’t modified on Day 2, so we carry over the most recent modification (Day 1): Open 10.

Open 10

Negotiation 60

Closed 0

Total Amount by Stage on Day 3

Id A wasn’t modified on Day 3, so we carry over the most recent modification (Day 2): Negotiation 60.

Id A was modified on Day 3: Negotiation 30

Total Amount by Stage on Day 3

Open 0

Negotiation 90

Closed 0

Advertisement

Answer

Basically, you seem to want the most recent value for each id — and it only gets counted for the most recent stage.

You can get this using a formulation like this:

select d.DateDay, s.stage, coalesce(sh.amount, 0)
from (select distinct sh.DateDay from stage_history sh) d cross join
     (select distinct sh.stage from stage_history sh) s left join lateral
     (select sum(sh.amount) as amount
      from (select distinct on (sh.id) sh.*
            from stage_history sh
            where sh.DateDay <= d.DateDay 
            order by sh.id, sh.DateDay desc
           ) sh
      where sh.stage = s.stage
     ) sh
     on 1=1
order by d.DateDay, s.stage;

Here is a db<>fiddle.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement