Skip to content
Advertisement

Removing SQL Rows from Query if two rows have an identical ID but differences in the columns

I´m currently working stuck on a SQL issue (well, mainly because I can´t find a way to google it and my SQL skills do not suffice to solve it myself)

I´m working on a system where documents are edited. If the editing process is finished, users mark the document as solved. In the MSSQL database, the corresponding row is not updated but instead, a new row is inserted. Thus, every document that has been processed has [e.g.: should have] multiple rows in the DB.

See the following situation:

ID ID2 AnotherCondition Steps Process Solved
1 1 yes Three ATAT AF
2 2 yes One ATAT FR
2 3 yes One ATAT EG
2 4 yes One ATAT AF
3 5 no One ABAT AF
4 6 yes One ATAT FR
5 7 no One AVAT EG
6 8 yes Two SATT FR
6 9 yes Two SATT EG
6 10 yes Two SATT AF

I need to select the rows which have not been processed yet. A “processed” document has a “FR” in the “Solved” column. Sadly other versions of the document exist in the DB, with other codes in the “Solved” columns.

Now: If there is a row which has “FR” in the “Solved” column I need to remove every row with the same ID from my SELECT statement as well. Is this doable?

In order to achieve this, I have to remove the rows with the IDs 2 | 4 (because the system sadly isn´t too reliable I guess) | and 6 in my select statement. Is this possible in general?

What I could do is to filter out the duplicates afterwards, in python/js/whatever. But I am curious whether I can “remove” these rows directly in the SQL statement as well.

To rephrase it another time: How can I make a select statement which returns only (in this example) the rows containing the ID´s 1, 3 and 5?

Advertisement

Answer

If you need to delete all rows where every id doesn’t have any “Solved = ‘no’“, you can use a DELETE statement that will exclude all “id” values that have at least one “Solved = ‘no’” in the corresponding rows.

DELETE FROM tab
WHERE id NOT IN (SELECT id FROM tab WHERE Solved1 = 'no');

Check the demo here.


Edit. If you need to use a SELECT statement, you can simply reverse the condition in the subquery:

SELECT * 
FROM tab
WHERE id NOT IN (SELECT id FROM tab WHERE Solved1 = 'yes');

Check the demo here.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement