I’m writing a query (using Athena – AWS) and I need to substitute all values from a group if there’s at least one occurrence of another value. To exemplify:
My original dataframe
ID v1 v2 v3 123 0 0 1 123 0 1 0 123 1 1 0 456 1 0 0 456 0 0 0
What I need: case when v1, v2 or v3 assume value 1 group by ID, then the whole column that contains 1 group by ID should be 1.
This is the output I need:
ID v1 v2 v3 123 1 1 1 456 1 0 0
Since v1, v2, v3 had at least one number 1 in the group ID 123, then all these columns for this group should be 1. However, ID 456 had at least one number 1 only in column v1, then v1 will be 1 and v2, v3 will remain 0.
Does someone know how to do that?
Advertisement
Answer
You may try the following
SELECT ID, MAX(v1) as v1, MAX(v2) as v2, MAX(v3) as v3 FROM your_table GROUP BY ID