Skip to content
Advertisement

SQL summing the same column with different date conditions

I’m using one table and am trying to sum total spend for two separate years, pulling from the same column.

Essentially I have a list of customers, and I’m trying to sum their 2018 spend, and their 2019 spend. I’ve tried a few different things, but I can’t seem to get the “case when” function to work because once the 2018 spend condition is met and that value is populated, it won’t sum for 2019, and vice versa—so I’ve got a total for 2018 OR 2019, but no customers are showing spend for both.

This is my query:

select * 
from
    (select 
         buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2,
         buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1,
         sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18,
         sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19
     from 
         database.table
     where 
         sale_date between date '2018-01-01' and date '2019-10-30'
     group by 
         buyer_first_name, buyer_last_name, buyer_address_1, buyer_address_2,
         buyer_address_city, buyer_address_state, buyer_address_zip, buyer_email, buyer_phone_1)

Any idea what I’m doing wrong? Thank you!

Advertisement

Answer

I really doubt that your problem is with SQL syntax or a bug. Your query looks like it should be doing what you want.

The issue is that those fields are not the same for any customer in the two years. Try something like this:

select buyer_last_name
       sum(case when sale_amount > 0 and year(sale_date) = 2018 then sale_amount end) as Spend18,
       sum(case when sale_amount > 0 and year(sale_date) = 2019 then sale_amount end) as Spend19
from database.table
where sale_date between date '2018-01-01' and date '2019-10-30'
group by buyer_last_name;

I speculate that last names are unlikely to change. If this works, you can start adding columns back to see where the problem columns are.

This is why databases are normalized. Repeated data tends to get out of synch.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement