I’m using one table and am trying to sum total spend for two separate years, pulling from the same column.
Essentially I have a list of customers, and I’m trying to sum their 2018 spend, and their 2019 spend. I’ve tried a few different things, but I can’t seem to get the “case when” function to work because once the 2018 spend condition is met and that value is populated, it won’t sum for 2019, and vice versa—so I’ve got a total for 2018 OR 2019, but no customers are showing spend for both.
This is my query:
select * from (select buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2, buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1, sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18, sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19 from database.table where sale_date between date '2018-01-01' and date '2019-10-30' group by buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2, buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1)
Any idea what I’m doing wrong? Thank you!
Advertisement
Answer
I really doubt that your problem is with SQL syntax or a bug. Your query looks like it should be doing what you want.
The issue is that those fields are not the same for any customer in the two years. Try something like this:
select buyer_last_name sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18, sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19 from database.table where sale_date between date '2018-01-01' and date '2019-10-30' group by buyer_last_name;
I speculate that last names are unlikely to change. If this works, you can start adding columns back to see where the problem columns are.
This is why databases are normalized. Repeated data tends to get out of synch.