I’m writing a query (using Athena – AWS) and I need to substitute all values from a group if there’s at least one occurrence of another value. To exemplify:
My original dataframe
x
ID v1 v2 v3
123 0 0 1
123 0 1 0
123 1 1 0
456 1 0 0
456 0 0 0
What I need: case when v1, v2 or v3 assume value 1 group by ID, then the whole column that contains 1 group by ID should be 1.
This is the output I need:
ID v1 v2 v3
123 1 1 1
456 1 0 0
Since v1, v2, v3 had at least one number 1 in the group ID 123, then all these columns for this group should be 1. However, ID 456 had at least one number 1 only in column v1, then v1 will be 1 and v2, v3 will remain 0.
Does someone know how to do that?
Advertisement
Answer
You may try the following
SELECT
ID,
MAX(v1) as v1,
MAX(v2) as v2,
MAX(v3) as v3
FROM
your_table
GROUP BY
ID