I tried to understand how the distinct (and therefore distinct count) work. I have a data with:
Sessionid, User_solici, code
While within 1 SessionID
there is 1 User_solici
. However, there could be multiple SessionID
for the same User_solici
.
For User_solici
of “a”, I have Session = 1, 55
and null
.
I want to find distinct SessionID
where (code = 7 and code = 1) OR (code = 7 and code = 2) for the same User_solici
.
My query is complete.
create table mytable
(
sessionid varchar(255),
User_solici varchar(255),
code varchar(255)
);
insert into mytable (sessionid, User_solici, code)
values
('1', 'a', '7'),
('1', 'a', '1'),
('1', 'a', '2'),
('1', 'a', '999'),
('2', 'b', '7'),
('2', 'b', '1'),
('2', 'b', '2'),
('2', 'b', '999'),
('55', 'a', '7'),
('55', 'a', '1'),
('55', 'a', '2'),
('', 'a', '7'),
('', 'a', '1'),
('', 'a', '2'),
('', 'c', '7'),
('', 'c', '1'),
('', 'c', '2');
select distinct
T1.sessionid
from
mytable T1
join
mytable T2 on T1.User_solici = T2.User_solici
and T1.code = '7'
and T2.code in ('1', '2');
select
T1.sessionid,
T1.User_solici,
T1.code as T1code,
T2.code as T2code
from
mytable T1
join
mytable T2 on T1.User_solici = T2.User_solici
and T1.code = '7'
and T2.code in ('1', '2');
But I try to understand what SQL does.
Look at the raw table, I see 5 sessionid (1,a), (2,b), (55,a), (,a) (,c) that satisfy.
OR I can argue only 4 sessionid works (1),(2),(55), and (blank)
I thought that the software will go to row 1, if code=7, then join on User_solici, when done, select distinct (in this case will be session 1 and blank), put this distinct to count (or final file). Done with row 1, go to row 2. This case, it will create 5 count.
Can anyone help me to understand how SQL handle the join (or null) in this case?
Thank you,
Harry
Advertisement
Answer
First, the DISTINCT clause will only return a single instance of all columns in your select regardless of how many records qualify the criteria.
Your first query actually works given your second assessment of 1, 2, 55 and blank. But because the blank session has multiple users, THAT is the break in your count. So, if you did DISTINCT SessionID AND User_Solici you will get the 5 entries I THINK you are looking for. I would just shift your T1 criteria into the WHERE clause as that has no bearing on the JOIN to T2 table
select distinct
T1.sessionid,
T1.User_solici
from
mytable T1
join mytable T2
on T1.User_solici = T2.User_solici
and T2.code in ('1','2')
where
T1.code = '7'