I´m currently working stuck on a SQL issue (well, mainly because I can´t find a way to google it and my SQL skills do not suffice to solve it myself)
I´m working on a system where documents are edited. If the editing process is finished, users mark the document as solved. In the MSSQL database, the corresponding row is not updated but instead, a new row is inserted. Thus, every document that has been processed has [e.g.: should have] multiple rows in the DB.
See the following situation:
ID | ID2 | AnotherCondition | Steps | Process | Solved |
---|---|---|---|---|---|
1 | 1 | yes | Three | ATAT | AF |
2 | 2 | yes | One | ATAT | FR |
2 | 3 | yes | One | ATAT | EG |
2 | 4 | yes | One | ATAT | AF |
3 | 5 | no | One | ABAT | AF |
4 | 6 | yes | One | ATAT | FR |
5 | 7 | no | One | AVAT | EG |
6 | 8 | yes | Two | SATT | FR |
6 | 9 | yes | Two | SATT | EG |
6 | 10 | yes | Two | SATT | AF |
I need to select the rows which have not been processed yet. A “processed” document has a “FR” in the “Solved” column. Sadly other versions of the document exist in the DB, with other codes in the “Solved” columns.
Now: If there is a row which has “FR” in the “Solved” column I need to remove every row with the same ID from my SELECT statement as well. Is this doable?
In order to achieve this, I have to remove the rows with the IDs 2 | 4 (because the system sadly isn´t too reliable I guess) | and 6 in my select statement. Is this possible in general?
What I could do is to filter out the duplicates afterwards, in python/js/whatever. But I am curious whether I can “remove” these rows directly in the SQL statement as well.
To rephrase it another time: How can I make a select statement which returns only (in this example) the rows containing the ID´s 1, 3 and 5?
Advertisement
Answer
If you need to delete all rows where every id doesn’t have any “Solved = ‘no’“, you can use a DELETE
statement that will exclude all “id” values that have at least one “Solved = ‘no’” in the corresponding rows.
DELETE FROM tab WHERE id NOT IN (SELECT id FROM tab WHERE Solved1 = 'no');
Check the demo here.
Edit. If you need to use a SELECT statement, you can simply reverse the condition in the subquery:
SELECT * FROM tab WHERE id NOT IN (SELECT id FROM tab WHERE Solved1 = 'yes');
Check the demo here.