Given a table
CREATE TABLE data( irs_number VARCHAR (50), mop_up INTEGER, ou VARCHAR (50) );
How would I return all matching records that…
- have at least one identical value for
irs_numberin another row AND - at least one
mop_upof those with the sameirs_numbermust be set to1AND - the
ouvalues are not identical, i.e. only return those not matching to a row with the identicalirs_number.
… so that all irs_numbers would be returned (not only one where the conditions are true – see example below).
I tried this but the query cannot finish within reasonable time:
SELECT irs_number, mop_up, ou FROM data outer_data WHERE (SELECT count(*) FROM data inner_data WHERE inner_data.irs_number = outer_data.irs_number AND inner_data.mop_up = 1 OR outer_data.mop_up = 1 AND inner_data.ou <> outer_data.ou );
As well as variations of duplicate counts as described here: How to find duplicate records in PostgreSQL – they will always just return the duplicates but not the proper filter applied.
edit:
Example data:
INSERT INTO data VALUES
('0001', 1, 'abc'),
('0001', 0, 'abc'),
('0001', 0, 'cde'),
('0001', 0, 'abc'),
('0002', 1, 'abc'),
('0002', 0, 'abc'),
('0003', 0, 'abc'),
('0003', 0, 'xyz')
;
SQLFiddle: http://sqlfiddle.com/#!17/be28f
a query should ideally return:
irs_number mop_up ou ----------------------- 0001 1 abc 0001 0 abc 0001 0 cde 0001 0 abc
(order not important)
meaning it should return all rows matching having the irs_number with the condition above.
Advertisement
Answer
You should be able to do this with a simple exists clause:
SELECT irs_number, mop_up, ou
FROM data d
WHERE EXISTS (SELECT 1
FROM data d2
WHERE d2.irs_number = d.irs_number AND
d2.mop_up = 1 AND
d2.ou <> d.ou
);
EDIT:
The above misinterpreted the question. It assumed that a mop_up = 1 needed to be on a different ou. As I read the question, this is ambiguous but doesn’t appear to be what you want. So, two exists address this:
SELECT irs_number, mop_up, ou
FROM data d
WHERE EXISTS (SELECT 1
FROM data d2
WHERE d2.irs_number = d.irs_number AND
d2.mop_up = 1
) AND
EXISTS (SELECT 1
FROM data d2
WHERE d2.irs_number = d.irs_number AND
d2.ou <> d.ou
);
Here is a db<>fiddle.
Both these solutions will be able to take advantage of an index on (irs_number, mop_up, ou).