Here is my population dataset detailing the population of each state through 2010 – 2019:
Here is my consumer complaints dataset which has date received and state:
I already have the complaints for each state per year:
I know I need to join compltsbypear and uspop on state and divide 2011(from complaints) / 2011 from uspop, etc for each match year.
How do I write the query that shows the states and for each year, the ratio of complaints to population?
Advertisement
Answer
I would approach this by unpivoting the population data, so there is one row per state and year. Then aggregate the complaints and join
.
Your code looks like SQL Server, so I will use those conventions:
with up as ( select v.* from uspop u cross apply (values (u.area, 2010, u.[2010]), (u.area, 2011, u.[2011]), . . . ) v(state, year, pop) ), cc as ( select year(datereceived) as year, state, count(*) as num_complaints from consumercomplaints cc group by year(datereceived), state ) select up.state, up.year, cc.num_complaints, up.pop, (cc.num_complaints * 1.0 / up.pop) as complaint_ratio from up join cc on up.state = cc.state and up.year = cc.year;
This produces the results you want in a perfectly reasonable format of one row per state/year. If you want it in a different format, you seem to be aware of how to pivot the data.