I’m trying to do some crosstabs in SQL Server 2008 R2. That part is alright, however, if I try to get percentages for each cell, I run into a problem.
Here is a distilled use case: A survey where people give their favorite color and their favorite fruit. I’d like to know how many like a given fruit AND a given color:
with survey as (
select 'banana' fav_fruit, 'yellow' fav_color
union select 'banana', 'red'
union select 'apple', 'yellow'
union select 'grape', 'red'
union select 'apple', 'blue'
union select 'orange', 'purple'
union select 'pomegranate', 'green'
)
select
s.fav_color,
sum(case
when s.fav_fruit = 'banana' then 1
else 0
end) as banana,
sum(case
when s.fav_fruit = 'banana' then 1
else 0
end) / sum(1) -- why does division always yield 0? "+", "-", and "*" all behave as expected.
* 100 as banana_pct,
sum(1) as total
from
survey s
group by
s.fav_color;
Results:
fav_color banana banana_pct total
------------------------------------
blue 0 0 1
green 0 0 1
purple 0 0 1
red 1 0 2
yellow 1 0 2
What I was expecting:
fav_color banana banana_pct total
------------------------------------
blue 0 0 1
green 0 0 1
purple 0 0 1
red 1 50 2
yellow 1 50 2
Please help me to get what I was expecting?
Advertisement
Answer
You are using SQL Server. Here is a much simpler example that replicates the issue:
select 1/2
SQL Server does integer division.
Replace the denominator with something like sum(1.0)
or sum(cast 1 as float)
or sum(1e0)
instead of sum(1)
.
Contrary to my expectation at least, SQL Server stores numbers with decimal points as numeric/decimal type (see here) rather than float
. The fixed number of decimal spaces might affect subsequent operations.