Skip to content
Advertisement

SQLite: how to exclude records in a database with join tables?

I have a database with a schema similar to:

CREATE TABLE IF NOT EXISTS song_artist
(
    song INTEGER, 
    artist INTEGER,

    FOREIGN KEY("song") REFERENCES "Songs"("song_id") 
        ON DELETE CASCADE,
    FOREIGN KEY("artist") REFERENCES "Artists"("artist_id") 
        ON DELETE CASCADE
);

CREATE TABLE IF NOT EXISTS Artists 
(
    "artist_id" INTEGER PRIMARY KEY,
    "artist_name" TEXT
);

CREATE TABLE IF NOT EXISTS Songs 
(
    "song_id" INTEGER PRIMARY KEY,
    "song_name" TEXT
);

Say they have the following data in them:

INSERT INTO Artists (artist_name) VALUES ("Bob");
INSERT INTO Artists (artist_name) VALUES ("Shawn");

INSERT INTO Songs (song_name) VALUES ("one song");
INSERT INTO Songs (song_name) VALUES ("another song");

INSERT INTO song_artist (artist, song) VALUES (1, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 2);

Artists:

artist_id artist_name
1 Bob
2 Shawn

Songs:

song_id song_name
1 one song
2 another song

song_artist:

song artist
1 1
1 2
2 2

Here, both Bob and Shawn are involved in one song, and only Shawn is involved in another song. Say I wanted to get the names of the songs where Shawn is involved, but not Bob. So, in this case it would be only another song. However, as to be expected, this query doesn’t work as it gives me both the songs:

SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
  AND Artists.artist_name != "Bob"
song_name
1 one song
2 another song

This would be the result of the join table song_artist having multiple records for the same song, so even if a record with Bob‘s ID is excluded, there’s still another record without Bob‘s ID that gets included in the final result.

Here’s how I’ve found a workaround to this:

SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
  AND Songs.song_id NOT IN (SELECT song_id
                            FROM Songs
                            JOIN Artists ON Artists.artist_id = song_artist.artist
                            JOIN song_artist ON song_artist.song = Songs.song_id
                            WHERE Artists.artist_name == "Bob")
song_name
1 another song

Now, while this works as expected, as one can imagine, this becomes a long winded query once you have a lot more than a single join table. I also assume this is a very un-optimized query. My question is, what is a better way to do this kind of query?

Advertisement

Answer

First, add a WHERE clause in your query which filters out any other artists than Shawn and Bob.
Then use conditional aggregation and in the HAVING clause add the condition that Bob does not participate in the song (which leaves only Shawn):

SELECT s.*
FROM Songs s
JOIN song_artist sa ON sa.song = s.song_id
JOIN Artists a ON a.artist_id = sa.artist
WHERE a.artist_name IN ('Shawn', 'Bob')
GROUP BY s.song_id
HAVING SUM(a.artist_name = 'Bob') = 0;

See the demo.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement