Skip to content
Advertisement

Snowflake: ListAgg data based on condition

I have a table P that looks something like this:

ID | Status | Env
1  |   1    | Linux
1  |   1    | Windows
1  |   3    | Mac
2  |   1    | Linux
2  |   1    | Windows
2  |   1    | Mac
3  |   3    | Linux
3  |   0    | Windows
3  |   3    | Mac

Here, 1 signifies test success whereas any other number signify some kind of failure. I want to aggregate this data in such a way that for every failed test I have a comma separated list of failed environments in every row. And if there are no failures, there should be NULL in the new columns. The output would look something like

ID | Status | Env     | Failure_list
1  |   1    | Linux   | Mac
1  |   1    | Windows | Mac
1  |   3    | Mac     | Mac
2  |   1    | Linux   | Null
2  |   1    | Windows | Null
2  |   1    | Mac     | Null
3  |   3    | Linux   | Linux, Windows, Mac
3  |   0    | Windows | Linux, Windows, Mac
3  |   3    | Mac     | Linux, Windows, Mac

I am using the snowflake LISTAGG() function in a query like

SELECT ID, STATUS, LISTAGG(ENV, ', ') 
FROM P
GROUP BY ID, STATUS

This is the output I get:

ID | Status | Env
1  |   1    | Linux, Windows
1  |   3    | Mac
2  |   1    | Linux, Windows, Mac
3  |   0    | Windows
3  |   3    | Linux, Mac

How can I change this query to get the output that I am looking for?

Advertisement

Answer

You could solve this with a correlated subquery:

select
    t.*,
    (
        select listagg(t1.env)
        from mytable t1
        where t1.id = t.id and t1.status <> 1
    ) failure_list
from mytable t

Or better yet using listagg() as a window function, which Snowflake supports:

select 
    t.*,
    listagg(case when status <> 1 then env end) over(partition by id) failure_list
from mytable t
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement