I’m trying to generate a query with left join. The idea is that I can get:
JOBA,SCRIPTA JOBA,SCRIPTC
Because I need only rows that have same columnA (job), with different columnB (bat).
I’m trying with left join but it is not working. Could you help me why my sql sentence is wrong?
create table definition
(
job varchar(256),
bat varchar(256)
);
INSERT INTO definition (job,bat)
VALUES
('JOBA','SCRIPTA'),
('JOBA','SCRIPTA'),
('JOBB','SCRIPTB'),
('JOBA','SCRIPTC'),
('JOBB','SCRIPTB');
with duplicates as
(
select distinct a.job,a.bat from definition a where job in (
SELECT job
FROM definition
GROUP BY job
HAVING COUNT(*) > 1) order by job desc
),
duplicatetwofields as
(
SELECT job, bat
FROM definition
GROUP BY job, bat
HAVING COUNT(*) > 1 order by job desc
)
select a.job,a.bat from duplicates a left join duplicatetwofields b
on (a.job=b.job)
--and a.bat=b.bat)
where a.bat is null or a.job is null;
Advertisement
Answer
GROUPBY with COUNT() will get you there, but the code in your original question is not exactly what you want:
SELECT job FROM definition GROUP BY job HAVING COUNT(*) > 1
You’re putting job values into groups and then pulling the value if the group count is greater than one. Both JOBA and JOBB exist more than once, so every job group has more than one member. This means JOBA and JOBB both get returned. What you want is something like this:
SELECT job FROM definition GROUP BY job HAVING COUNT(DISTINCT(bat)) > 1
This puts the jobs into groups, but only returns them if there is more than one distinct bat for every job. This query only returns JOBA because JOBA is the only group that has more than 1 distinct bat value.
You also don’t need to eliminate duplicates with a second CTE. Simply use a distinct on the final query to get unique results. Something like this should work given the parameters of the question (I used a subquery, it’s fine to use a CTE if you prefer):
SELECT DISTINCT d.*
FROM definition d
JOIN
(
SELECT job
FROM definition
GROUP BY job
HAVING COUNT(DISTINCT(bat)) > 1
) x
ON d.job = x.job