I’m trying to write a simple query to take a data set that looks like this:
ID | Col2
X B
X C
Y B
Y D
and return this:
ID | Col2 | Col3
X B C
Y B D
Essentially, I have an ID column that can have either B, C, or D in Col2. I am trying to identify which IDs only have B and D. I have a query to find both, but not only that combination. Query:
select ID, Col2
from Table1
where ID in (
select ID from Table1
group by ID
having count(distinct Col2) = 2)
order by ID
Alternatively, I could use help in finding a way to filter that query on B and D and leave off B and C. I have seen perhaps a self join, but am not sure how to implement that.
Thanks!
EDIT: Most of the data set has, for a given ID, all three of B, C, and D. The goal here is to isolate the IDs that are missing one, namely missing C.
Advertisement
Answer
I am trying to identify which IDs only have B and D. I have a query to find both
If this is what you want, you don’t need multiple columns:
select id
from table1
where col2 in ('B', 'D')
group by id
having count(distinct col2) = 2;
If you want only 'B'
and 'D'
and no others, then:
select id
from table1
group by id
having sum(case when col2 = 'B' then 1 else 0 end) > 0 AND
sum(case when col2 = 'C' then 1 else 0 end) > 0 AND
sum(case when col2 not in ('B', 'D') then 1 else 0 end) = 0;
If there are only two columns, you can also easily pivot the values using aggregation:
select id, min(col2), nullif(max(col2), min(col2))
from table1
group by id;