I have a simple table of the form:
id | gender | a_feature (bool) | b_feature (bool) | … | xyz_feature (bool) |
---|
and I want to sum over all feature columns dependent on gender.
metric | male | female |
---|---|---|
a_feature | 345 | 3423 |
b_feature | 65 | 143 |
… | … | … |
xyz_feature | 133 | 5536 |
Is there a simple way to do this, e.g. using the information_schema.
I found only the solution below, but this is very ugly:
select 'a_feature' as feature_name, count(case a_feature and gender = 'male') as male, count(case a_feature and gender = 'female') as female from table union select b_feature as feature_name, count(case b_feature and gender = 'male') as male, count(case b_feature and gender = 'female') as female from table . . . select xyz_feature as feature_name, count(case xyz_feature and gender = 'male') as male, count(case xyz_feature and gender = 'female') as female from table
Advertisement
Answer
You can unpivot and aggregate. One method is:
select name, sum(case when feature and gender = 'male' then 1 else 0 end) as num_male, sum(case when feature and gender = 'female' then 1 else 0 end) as num_female from ((select 'a_feature' as name, a_feature as feature, gender from t ) union all (select 'b_feature' as name, b_feature, gender from t ) union all . . . ) f group by name;
In Postgres, you would unpivot using a lateral join:
select name, sum(case when feature and gender = 'male' then 1 else 0 end) as num_male, sum(case when feature and gender = 'female' then 1 else 0 end) as num_female from t cross join lateral (values ('a_feature', a_feature), ('b_feature', b_feature), . . . ) v(name, feature) group by name;
You can generate the list for values()
using information_schema.columns
if you are reluctant to type it all in.
EDIT:
You can construct the values
clause using something like this:
select string_agg('(''' || column_name || ''', column_name)', ', ') from information_schema.columns where table_name = ?