SQL / PostgreSQL: count multiple columns with conditions

I have a simple table of the form:

id, gender, a_feature (bool), b_feature (bool), xyz_feature (bool)

and I want to count, for each feature column, how many rows have it set to true, broken down by gender:

metric       male  female
a_feature     345    3423
b_feature      65     143
xyz_feature   133    5536

Is there a simple way to do this, e.g. using the information_schema?

The only solution I have found is the one below, which is very ugly:

select
       'a_feature' as feature_name,
       count(case when a_feature and gender = 'male' then 1 end) as male,
       count(case when a_feature and gender = 'female' then 1 end) as female
from table
union
select
       'b_feature' as feature_name,
       count(case when b_feature and gender = 'male' then 1 end) as male,
       count(case when b_feature and gender = 'female' then 1 end) as female
from table
.
.
.
select
       'xyz_feature' as feature_name,
       count(case when xyz_feature and gender = 'male' then 1 end) as male,
       count(case when xyz_feature and gender = 'female' then 1 end) as female
from table

Answer

You can unpivot and aggregate. One method is:

select name,
       sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
       sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from ((select 'a_feature' as name, a_feature as feature, gender
       from t
      ) union all
      (select 'b_feature' as name, b_feature, gender
       from t
      ) union all
      . . .
     ) f
group by name;

In Postgres, you would unpivot using a lateral join:

select name,
       sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
       sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from t cross join lateral
     (values ('a_feature', a_feature),
             ('b_feature', b_feature),
             . . .
     ) v(name, feature)
group by name;
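
As a self-contained illustration of the lateral-join version, here is a minimal sketch that can be pasted into psql. The temporary table t, its four sample rows, and the use of count(*) filter (...) in place of sum(case ...) are assumptions made for this example and are not part of the original question; the filter clause needs PostgreSQL 9.4 or later.

```sql
-- Hypothetical sample data; the real table has more feature columns.
create temporary table t (
    id          int primary key,
    gender      text,
    a_feature   boolean,
    b_feature   boolean,
    xyz_feature boolean
);

insert into t values
    (1, 'male',   true,  false, true),
    (2, 'female', true,  true,  false),
    (3, 'female', false, true,  true),
    (4, 'male',   true,  false, false);

-- Unpivot each row into (name, feature) pairs, then count per gender.
-- A NULL feature is not counted, because the filter condition is not true.
select v.name,
       count(*) filter (where v.feature and t.gender = 'male')   as num_male,
       count(*) filter (where v.feature and t.gender = 'female') as num_female
from t cross join lateral
     (values ('a_feature',   t.a_feature),
             ('b_feature',   t.b_feature),
             ('xyz_feature', t.xyz_feature)
     ) v(name, feature)
group by v.name
order by v.name;
```

With these four rows the query should return a_feature 2 / 1, b_feature 0 / 2 and xyz_feature 1 / 1 (male / female). The table is scanned only once, whereas the union all form reads it once per feature column.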

You can generate the list for values() using information_schema.columns if you are reluctant to type it all in.

EDIT:

You can construct the values clause using something like this:

select string_agg('(''' || column_name || ''', ' || column_name || ')', ', ')
from information_schema.columns
where table_name = ?
  and data_type = 'boolean'
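
A slightly more defensive variant of the same idea, as a sketch: format() with %L and %I takes care of quoting the string literal and the identifier, and the order by keeps the pairs in column order. The schema name public and the table name t are assumptions for illustration; substitute your own.

```sql
-- Builds the text of the VALUES list from the catalog.
-- 'public' and 't' are assumed names, not taken from the question.
select string_agg(format('(%L, %I)', column_name, column_name), ', '
                  order by ordinal_position) as values_list
from information_schema.columns
where table_schema = 'public'
  and table_name   = 't'
  and data_type    = 'boolean';   -- skips id and gender
```

For a table with the three boolean columns shown in the question, this yields the text ('a_feature', a_feature), ('b_feature', b_feature), ('xyz_feature', xyz_feature). The result is only a string: it still has to be pasted into the lateral-join query, or assembled into dynamic SQL, before it can run.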
User contributions licensed under: CC BY-SA