Skip to content
Advertisement

Confused with Join logic

I find it so confused that my simple logic doesn’t work.

I have a data with 2 columns ID and code.

1 ID can link to multiple code.

So the sample data is like this:

ID   Code
--   ----
a    1
a    2
a    3
b    1
b    3
c    1
c    3
c    4
d    2
d    3
d    4

The rule is that:

  • Find ID that has code = 1 AND code not equal to 2 AND code =3

So clearly a is fail, b is good, c is good, d is fail.

I have 2 approaches below and neither of them work!

Can you please help?

Many thanks,

Harry

Approach #1:

create table mytable (
ID int,
code varchar(255)
);

insert into mytable (ID, code)
values('a','1'),('a','2'),('a','3'),('b','1'),('b','3'),('c','1'),('c','3'),('c','4'),('d','2'),('d','3'),('d','4');


select distinct 
    ID 
from 
    mytable 
where 
    code = 1
    and ID not in (select ID from mytable where code = 2)
    and ID in (select ID from mytable where code = 3);

Approach #2: join my table 2 times

select distinct(T1.ID) 
from
    (select distinct(mytable1.ID) 
     from mytable mytable1 
     join mytable mytable2 on  mytable1.ID = mytable2.ID
     where mytable1.code = 1 
       and mytable2.code not in ('2')) as T1
join
    (select distinct(mytable3.ID) 
     from mytable mytable3 
     where mytable3.code = 3) as T2 
on T1.ID = T2.ID

Advertisement

Answer

Welcome to S/O. Your posts will get better with time with reading others and additional feedback. Your question states one thing, but query offers another.

Your WHERE clause on its individual parts is asking for the code = 1, and the corresponding “ID” does not have a code = 2, but it DOES have a code = 3 IN ADDITION to code = 1.

where 
    code = 1
    and ID not in (select ID from mytable where code = 2)
    and ID in (select ID from mytable where code = 3);

Aside from aggregations, you can use the same table and self-join on the conditions you want. In this case, I might start with your base query of just the code = 1. Then I would join again on same ID, but it also having a code 3 (both 1 and 3 required). Then a third LEFT JOIN for code 2 and ensuring it is NOT found

select
      t1.id
   from
      MyTable t1
         -- now, does it also exist for the same "ID" but also have code = 3
         JOIN MyTable t2
            on t1.id = t2.id
            AND t2.code = 3
         -- here, I am explicitly LOOKING for code = 2 for same ID.
         -- but by doing a LEFT-JOIN, I only want where it does NOT exist
         LEFT JOIN MyTable t3
            on t1.id = t3.id 
            AND t3.code = 2
   where
          t1.code = 1
      -- explicitly making sure the is no match for same ID, code = 2
      AND t3.id IS NULL

Another approach is via aggregation, but getting a count of 1 & 3, another for code = 2 but making sure counts make sense to your needs.

select
      t1.id
   from
      MyTable t1
   having
          sum( case when t1.code = 1 then 1 else 0 end ) > 0
      AND sum( case when t1.code = 3 then 1 else 0 end ) > 0
      AND sum( case when t1.code = 2 then 1 else 0 end ) = 0
   group by
      t1.id

In the above, say you have multiple records for a given ID that are code = 1, you MAY get a count higher than 1. Hence I am summing just for code 1 and only care if it has AT LEAST 1. Similar requirement for code = 3 having the sum of AT LEAST 1. Finally, the sum of code = 2 making sure it is = 0 for its count.

FEEDBACK on SECOND query from OP

The reason for your failure on the second query is in the first part before your join for code test = 3

select distinct(T1.ID) 
from
    (select distinct(mytable1.ID) 
     from mytable mytable1 
     join mytable mytable2 on  mytable1.ID = mytable2.ID
     where mytable1.code = 1 
       and mytable2.code not in ('2')) as T1

You are querying where table1.code = 1 which is ok, and then your join from table1 to table2 is only on the ID. So, for example on your data with

id   code
a    1
a    2
a    3

your self join is returning records with codes 1, 2 AND 3. So even though you may not want #2, it is still pulling in IDs 1 & 3 which DO qualify, hence the logical failure. That is why in my FIRST example, I am doing a LEFT JOIN on same id and explicitly code = 2… but the where clause is explicitly EXCLUDING by where the table 3 alias ID is NULL, meaning it does NOT see a record for same ID, code = 2.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement