Skip to content
Advertisement

SQL Case statement with Count?

I have a database that allows for more than one ethnicity per person. Unfortunately, our answers are essentially Yes Hispanic, Not Hispanic, and Unknown, and there are some who do indeed have multiple selections. I need to run a large query that pulls lots of info, one of which is ethnicity, and I want to “convert” those that have multiple selections as Unknown.

person_ethnicity_xref table:

Person_ID Ethnicity_ID
1234567 SLWOWQ
1234567 ZLKJDU

mstr_lists table:

Ethnicity_ID Ethnicity
SLWOWQ Hispanic
ZLKJDU Not Hispanic

I’ve been struggling with this as I can’t use a For XML Path with two tables, so I’m now trying to use the logic of Case When count(ethnicity_ID)>1 then ‘Unknown’ Else Ethnicity End

Here’s what I have

select
p.person_nbr,
case    
    when count(eth1.ethnicity_item_id)>1 then 'Unknown'
    else ml1.mstr_list_item_desc
end 'final eth'

from
person_table p
left join person_ethnicity_xref eth1  on p.person_id=eth1.person_id
left join  mstr_lists ml1 on eth1.ethnicity_item_id=ml1.mstr_list_item_id

group by
p.person_nbr,
ml1.mstr_list_item_desc

This gave me results but when I check them, those with >1 don’t have a value of Unknown and people are listed twice once with each ethnicity.

Another part of this larger query has a subquery in the FROM that counts race and a separate table join for only those with a count=1. Then the case says if the subquery that counts race came up with >1 then X otherwise use that other table for count=1. Because the race table also uses that mstr_list there’s then 5 tables involved (there’s a second person_id join now that I look at it more closely, and there’s a mstr_list to the count and the regular tables…I have no idea why, my brain is tired and that count table isn’t a simple count and also is doing something else). Is this really the only option? This query already takes over 10 min to run (it is not run on production!) and I’d hate to make it worse by duplicating what the previous writer did.

Advertisement

Answer

Use aggregation:

select p.person_nbr,
       (case when min(ml1.mstr_list_item_desc) = max(ml1.mstr_list_item_desc)
             then min(ml1.mstr_list_item_desc)
             else 'Unknown'
        end) as final_ethnicity
from person_table p left join
     person_ethnicity_xref eth1
     on p.person_id = eth1.person_id left join
     mstr_lists ml1
     on eth1.ethnicity_item_id = ml1.mstr_list_item_id
group by p.person_nbr;

Note: This slightly tweaks your logic. If there a multiple ethnicities and they are all the same, then that value is used.

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement