I find it so confused that my simple logic doesn’t work.
I have a data with 2 columns ID and code.
1 ID can link to multiple code.
So the sample data is like this:
ID Code -- ---- a 1 a 2 a 3 b 1 b 3 c 1 c 3 c 4 d 2 d 3 d 4
The rule is that:
- Find ID that has code = 1 AND code not equal to 2 AND code =3
So clearly a is fail, b is good, c is good, d is fail.
I have 2 approaches below and neither of them work!
Can you please help?
Many thanks,
Harry
Approach #1:
create table mytable ( ID int, code varchar(255) ); insert into mytable (ID, code) values('a','1'),('a','2'),('a','3'),('b','1'),('b','3'),('c','1'),('c','3'),('c','4'),('d','2'),('d','3'),('d','4'); select distinct ID from mytable where code = 1 and ID not in (select ID from mytable where code = 2) and ID in (select ID from mytable where code = 3);
Approach #2: join my table 2 times
select distinct(T1.ID) from (select distinct(mytable1.ID) from mytable mytable1 join mytable mytable2 on mytable1.ID = mytable2.ID where mytable1.code = 1 and mytable2.code not in ('2')) as T1 join (select distinct(mytable3.ID) from mytable mytable3 where mytable3.code = 3) as T2 on T1.ID = T2.ID
Advertisement
Answer
Welcome to S/O. Your posts will get better with time with reading others and additional feedback. Your question states one thing, but query offers another.
Your WHERE clause on its individual parts is asking for the code = 1, and the corresponding “ID” does not have a code = 2, but it DOES have a code = 3 IN ADDITION to code = 1.
where code = 1 and ID not in (select ID from mytable where code = 2) and ID in (select ID from mytable where code = 3);
Aside from aggregations, you can use the same table and self-join on the conditions you want. In this case, I might start with your base query of just the code = 1. Then I would join again on same ID, but it also having a code 3 (both 1 and 3 required). Then a third LEFT JOIN for code 2 and ensuring it is NOT found
select t1.id from MyTable t1 -- now, does it also exist for the same "ID" but also have code = 3 JOIN MyTable t2 on t1.id = t2.id AND t2.code = 3 -- here, I am explicitly LOOKING for code = 2 for same ID. -- but by doing a LEFT-JOIN, I only want where it does NOT exist LEFT JOIN MyTable t3 on t1.id = t3.id AND t3.code = 2 where t1.code = 1 -- explicitly making sure the is no match for same ID, code = 2 AND t3.id IS NULL
Another approach is via aggregation, but getting a count of 1 & 3, another for code = 2 but making sure counts make sense to your needs.
select t1.id from MyTable t1 having sum( case when t1.code = 1 then 1 else 0 end ) > 0 AND sum( case when t1.code = 3 then 1 else 0 end ) > 0 AND sum( case when t1.code = 2 then 1 else 0 end ) = 0 group by t1.id
In the above, say you have multiple records for a given ID that are code = 1, you MAY get a count higher than 1. Hence I am summing just for code 1 and only care if it has AT LEAST 1. Similar requirement for code = 3 having the sum of AT LEAST 1. Finally, the sum of code = 2 making sure it is = 0 for its count.
FEEDBACK on SECOND query from OP
The reason for your failure on the second query is in the first part before your join for code test = 3
select distinct(T1.ID) from (select distinct(mytable1.ID) from mytable mytable1 join mytable mytable2 on mytable1.ID = mytable2.ID where mytable1.code = 1 and mytable2.code not in ('2')) as T1
You are querying where table1.code = 1 which is ok, and then your join from table1 to table2 is only on the ID. So, for example on your data with
id code a 1 a 2 a 3
your self join is returning records with codes 1, 2 AND 3. So even though you may not want #2, it is still pulling in IDs 1 & 3 which DO qualify, hence the logical failure. That is why in my FIRST example, I am doing a LEFT JOIN on same id and explicitly code = 2… but the where clause is explicitly EXCLUDING by where the table 3 alias ID is NULL, meaning it does NOT see a record for same ID, code = 2.