I have the following data set for an user. I want to merge the False flag rows to the latest True row.
initial set:
x
user input_date starting_balance deposit flag
user1 2020-08-20 100 20 TRUE
user1 2020-08-21 200 0 TRUE
user1 2020-08-23 null 200 FALSE
user1 2020-08-24 300 130 TRUE
user1 2020-08-25 400 0 TRUE
output:
user input_date starting_balance deposit flag
user1 2020-08-20 100 20 TRUE
user1 2020-08-21 200 0 TRUE
user1 2020-08-24 300 330 TRUE
user1 2020-08-25 400 0 TRUE
sql to recreate data:
select 'user1' as user, '2020-08-20' as input_date, 100 as starting_balance, 20 as deposit, true as flag
union all
select 'user1', '2020-08-21', 200, 0, true
union all
select 'user1', '2020-08-23', null, 200, false
union all
select 'user1', '2020-08-24', 300, 130, true
union all
select 'user1', '2020-08-25', 400, 0, true;
Advertisement
Answer
This is an aggregation problem. You can use a cumulative sum to assign groups. And then aggregate. The group is the number of “trues” up to each row. So:
select user, max(input_date), sum(starting_balance),
sum(deposit), true as flag
from (select t.*,
sum(case when flag then 1 else 0 end) over (partition by user order by input_date desc) as grp
from t
) t
group by user, grp
order by user, input_date;