I have a table P
that looks something like this:
ID | Status | Env 1 | 1 | Linux 1 | 1 | Windows 1 | 3 | Mac 2 | 1 | Linux 2 | 1 | Windows 2 | 1 | Mac 3 | 3 | Linux 3 | 0 | Windows 3 | 3 | Mac
Here, 1
signifies test success whereas any other number signify some kind of failure. I want to aggregate this data in such a way that for every failed test I have a comma separated list of failed environments in every row. And if there are no failures, there should be NULL
in the new columns. The output would look something like
ID | Status | Env | Failure_list 1 | 1 | Linux | Mac 1 | 1 | Windows | Mac 1 | 3 | Mac | Mac 2 | 1 | Linux | Null 2 | 1 | Windows | Null 2 | 1 | Mac | Null 3 | 3 | Linux | Linux, Windows, Mac 3 | 0 | Windows | Linux, Windows, Mac 3 | 3 | Mac | Linux, Windows, Mac
I am using the snowflake LISTAGG() function in a query like
SELECT ID, STATUS, LISTAGG(ENV, ', ') FROM P GROUP BY ID, STATUS
This is the output I get:
ID | Status | Env 1 | 1 | Linux, Windows 1 | 3 | Mac 2 | 1 | Linux, Windows, Mac 3 | 0 | Windows 3 | 3 | Linux, Mac
How can I change this query to get the output that I am looking for?
Advertisement
Answer
You could solve this with a correlated subquery:
select t.*, ( select listagg(t1.env) from mytable t1 where t1.id = t.id and t1.status <> 1 ) failure_list from mytable t
Or better yet using listagg()
as a window function, which Snowflake supports:
select t.*, listagg(case when status <> 1 then env end) over(partition by id) failure_list from mytable t