Skip to content
Advertisement

Creating extra columns based on condition (Case When Sum)

I have survey table where someone is asked roughly 5 questions. 4 of those questions are the same questions, but the options to their answers are different since they were to understand their purchase.

Here are the questions:

 ID                Question                            qid                         Answer
 101005 what brands did you purchase the past 5 months  1          Coca-Cola or Pepsi or Dr.Pepper
 101005 what brands did you purchase the past 5 months  1                       Dr.Pepper
 101005 what brands did you purchase the past 5 months  1                      store brand
 101005 what brands did you purchase the past 5 months  1                      Coca-Cola
 101005  how many people live in your house             4                           4
 101005  what is your prefer retailers                  8                          walmart 

The goal is to create four extra columns based on their answer and they will be assigned a 1 or 0. Since this person’s answer is coca cola, I want to assign them in the column of current_buyer and give them 1 and 0 will be new_buyer column. I also want to make sure that even though he answer Dr.Pepper in the second row, it still recognizes him as a current_buyer. In the same breath I want to assign this person a 1 in the 3rd column as a drinker and 0 4th column in prospect.

Here is how the table should look like

  ID             Question                             qid               answer                    Current_buyer    New_buyer   drinker    prospect      
 101005 what brands did you purchase the past 5 months  1        Coca-Cola or Pepsi or Dr.Pepper      1               0             1        0
 101005 what brands did you purchase the past 5 months  1                       Dr.Pepper             1               0             1        0
 101005 how many people live in your house              4                            4                1               0             1        0

The goal is to see if ID bought coca-cola the past 5 months, they are a current buyer (1) and drinker (1) and will have (0) for new_buyer and prospect in their entire profile.

Here is the code that I try:

 select ID,qid,question,answer,s_date,   
 Case when Sum(Case when [answer] like'%%coca-cola%%' then 1 else 0 end)>=1 then 1 
 else 0 end  current_buyer
 ,Case when Sum(Case when [answer] like'%%coca-cola%%' then 1 else 0 end)=0 then 1 
  else 0 end  New_buyer 
 ,Case when Sum(Case when [answer] like'%%coca-cola,Dr.pepper,pepsi%%' then 1 else 0 end)>=1 then 1 
  else 0 end  drinker
 ,Case when Sum(Case when [answer] like'%%coca-cola,Dr.pepper,pepsi%%' then 1 else 0 end)=0 then 1 
  else 0 end  Prospect

Unfortunately, using this code I’m getting 0 in the drinker column even though people selected coca-cola. Any help would be appreciated.

Advertisement

Answer

I’m not 100% sure that I understand the logic behind assigning the various values, but in general it seems that you are assigning the value based on an aggregation of multiple rows. Therefore you need to do the aggregation first (in this case in a CTE) and then join back to that for your query.

Also worth noting is that even though your are doing your matching with a LIKE operator – everything between the [%] characters will need to match exactly somewhere in your [answer]. I don’t think that this is what you want – so I’ve split them out. However I’m not 100% sure if I’ve got the logical operators completely correct.

Do you need something like:

;with analysisCTE as (select ID,qid,   
     Case when Sum(Case when [answer] like'%%coca-cola%%' then 1 else 0 end)>=1 then 1 
     else 0 end  current_buyer
     ,Case when Sum(Case when [answer] like'%%coca-cola%%' then 1 else 0 end)=0 then 1 
      else 0 end  New_buyer 
     ,Case when Sum(Case when [answer] like'%%coca-cola%%' OR [answer] like '%%Dr.pepper%%' OR [answer] like '%%pepsi%%' then 1 else 0 end)>=1 then 1 
      else 0 end  drinker
     ,Case when Sum(Case when [answer] like'%%coca-cola%%' OR [answer] like '%%Dr.pepper%%' OR [answer] like '%%pepsi%%' then 1 else 0 end)=0 then 1 
      else 0 end  Prospect
    from drinks
    group by id, qid)
select d.ID,d.qid,d.question,d.answer, a.current_buyer, a.New_buyer, a.drinker, a.Prospect
from drinks d
inner join analysisCTE a on d.id = a.id and d.qid = a.qid

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement