I have a dataset of updated properties:
user_id | flag | updated_at |
---|---|---|
123 | 1 | 101 |
123 | 0 | 101 |
123 | 1 | 102 |
123 | 1 | 103 |
123 | 1 | 104 |
124 | 0 | 101 |
124 | 0 | 102 |
124 | 0 | 110 |
I need to assign to each row a value when was the most recent change of flag
value, so it should look like this:
user_id | flag | updated_at | recent_updated_at |
---|---|---|---|
123 | 1 | 101 | 101 |
123 | 0 | 102 | 102 |
123 | 1 | 103 | 103 |
123 | 1 | 104 | 103 |
123 | 1 | 105 | 103 |
124 | 0 | 101 | 101 |
124 | 0 | 102 | 101 |
124 | 0 | 110 | 101 |
I have this query so far, but feel stuck at the moment:
with raw_dataset as ( SELECT 123 as user_id, 1 as flag, 101 as updated_at UNION ALL SELECT 123 as user_id, 0 as flag, 102 as updated_at UNION ALL SELECT 123 as user_id, 1 as flag, 103 as updated_at UNION ALL SELECT 123 as user_id, 0 as flag, 104 as updated_at UNION ALL SELECT 123 as user_id, 0 as flag, 105 as updated_at UNION ALL SELECT 124 as user_id, 1 as flag, 101 as updated_at UNION ALL SELECT 124 as user_id, 1 as flag, 103 as updated_at UNION ALL SELECT 124 as user_id, 1 as flag, 110 as updated_at ) , dataset as ( select * from ( select *, -- lag(flag, 1, -1) over (partition by user_id order by updated_at ASC) as lag_, row_number() over (partition by user_id, flag order by updated_at ASC) as rn from raw_dataset ) ) SELECT *, first_value(updated_at) over (partition by user_id,flag order by rn ASC) FROM dataset ORDER BY user_id, updated_at ASC
Any ideas, tips? Thanks in advance!
Advertisement
Answer
You appear to want the most recent time when the flag
value changed. Your question is rather confusing because the data in the question is different from the data in the code.
However, the follow returns the data in the question and what you want:
with raw_dataset as ( SELECT 123 as user_id, 1 as flag, 101 as updated_at UNION ALL SELECT 123 as user_id, 0 as flag, 102 as updated_at UNION ALL SELECT 123 as user_id, 1 as flag, 103 as updated_at UNION ALL SELECT 123 as user_id, 1 as flag, 104 as updated_at UNION ALL SELECT 123 as user_id, 1 as flag, 105 as updated_at UNION ALL SELECT 124 as user_id, 0 as flag, 101 as updated_at UNION ALL SELECT 124 as user_id, 0 as flag, 103 as updated_at UNION ALL SELECT 124 as user_id, 0 as flag, 110 as updated_at ) select rd.*, max(case when prev_flag is null or prev_flag <> flag then updated_at end) over ( partition by user_id order by updated_at ) as most_recent_updated_at from (select rd.*, lag(flag) over (partition by user_id order by updated_at) as prev_flag from raw_dataset rd ) rd;
The logic is pretty simple. Just look at the previous flag. Then do a cumulative max()
when the value changes.
Here is a db<>fiddle.