I want to fill NULL
values in device
column for each session_id
with an associated non-NULL value. How can I achieve that?
Here is the sample data:
+------------+-------+---------+ | session_id | step | device | +------------+-------+---------+ | 351acc | step1 | | | 351acc | step2 | | | 351acc | step3 | mobile | | 351acc | step4 | mobile | | 350bca | step1 | desktop | | 350bca | step2 | | | 350bca | step3 | | | 350bca | step4 | desktop | +------------+-------+---------+
Desired output:
+------------+-------+---------+ | session_id | step | device | +------------+-------+---------+ | 351acc | step1 | mobile | | 351acc | step2 | mobile | | 351acc | step3 | mobile | | 351acc | step4 | mobile | | 350bca | step1 | desktop | | 350bca | step2 | desktop | | 350bca | step3 | desktop | | 350bca | step4 | desktop | +------------+-------+---------+
Advertisement
Answer
The window function first_value()
with the right ordering is probably cheapest:
SELECT session_id, step , COALESCE(device , first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL, step) ) AS device FROM tbl ORDER BY session_id DESC, step;
db<>fiddle here
ORDER BY device IS NULL, step
sorts NULL
values last, so the earliest step
with a notnull value is picked. See:
If notnull devices per session_id
are always the same, you can simplify to just ORDER BY device IS NULL
. And you don’t need COALESCE
.