I tried to understand how the distinct (and therefore distinct count) work. I have a data with:
Sessionid, User_solici, code
While within 1 SessionID
there is 1 User_solici
. However, there could be multiple SessionID
for the same User_solici
.
For User_solici
of “a”, I have Session = 1, 55
and null
.
I want to find distinct SessionID
where (code = 7 and code = 1) OR (code = 7 and code = 2) for the same User_solici
.
My query is complete.
create table mytable ( sessionid varchar(255), User_solici varchar(255), code varchar(255) ); insert into mytable (sessionid, User_solici, code) values ('1', 'a', '7'), ('1', 'a', '1'), ('1', 'a', '2'), ('1', 'a', '999'), ('2', 'b', '7'), ('2', 'b', '1'), ('2', 'b', '2'), ('2', 'b', '999'), ('55', 'a', '7'), ('55', 'a', '1'), ('55', 'a', '2'), ('', 'a', '7'), ('', 'a', '1'), ('', 'a', '2'), ('', 'c', '7'), ('', 'c', '1'), ('', 'c', '2'); select distinct T1.sessionid from mytable T1 join mytable T2 on T1.User_solici = T2.User_solici and T1.code = '7' and T2.code in ('1', '2'); select T1.sessionid, T1.User_solici, T1.code as T1code, T2.code as T2code from mytable T1 join mytable T2 on T1.User_solici = T2.User_solici and T1.code = '7' and T2.code in ('1', '2');
But I try to understand what SQL does.
Look at the raw table, I see 5 sessionid (1,a), (2,b), (55,a), (,a) (,c) that satisfy.
OR I can argue only 4 sessionid works (1),(2),(55), and (blank)
I thought that the software will go to row 1, if code=7, then join on User_solici, when done, select distinct (in this case will be session 1 and blank), put this distinct to count (or final file). Done with row 1, go to row 2. This case, it will create 5 count.
Can anyone help me to understand how SQL handle the join (or null) in this case?
Thank you,
Harry
Advertisement
Answer
First, the DISTINCT clause will only return a single instance of all columns in your select regardless of how many records qualify the criteria.
Your first query actually works given your second assessment of 1, 2, 55 and blank. But because the blank session has multiple users, THAT is the break in your count. So, if you did DISTINCT SessionID AND User_Solici you will get the 5 entries I THINK you are looking for. I would just shift your T1 criteria into the WHERE clause as that has no bearing on the JOIN to T2 table
select distinct T1.sessionid, T1.User_solici from mytable T1 join mytable T2 on T1.User_solici = T2.User_solici and T2.code in ('1','2') where T1.code = '7'