I have the following data set for an user. I want to merge the False flag rows to the latest True row.
initial set:
user input_date starting_balance deposit flag user1 2020-08-20 100 20 TRUE user1 2020-08-21 200 0 TRUE user1 2020-08-23 null 200 FALSE user1 2020-08-24 300 130 TRUE user1 2020-08-25 400 0 TRUE
output:
user input_date starting_balance deposit flag user1 2020-08-20 100 20 TRUE user1 2020-08-21 200 0 TRUE user1 2020-08-24 300 330 TRUE user1 2020-08-25 400 0 TRUE
sql to recreate data:
select 'user1' as user, '2020-08-20' as input_date, 100 as starting_balance, 20 as deposit, true as flag union all select 'user1', '2020-08-21', 200, 0, true union all select 'user1', '2020-08-23', null, 200, false union all select 'user1', '2020-08-24', 300, 130, true union all select 'user1', '2020-08-25', 400, 0, true;
Advertisement
Answer
This is an aggregation problem. You can use a cumulative sum to assign groups. And then aggregate. The group is the number of “trues” up to each row. So:
select user, max(input_date), sum(starting_balance), sum(deposit), true as flag from (select t.*, sum(case when flag then 1 else 0 end) over (partition by user order by input_date desc) as grp from t ) t group by user, grp order by user, input_date;