Skip to content
Advertisement

mySQL: Select rows if text column contains an item on a blacklist

I am trying to create a query that can select all posts from my database that contain a blacklisted tag unless the post contains a whitelisted tag. Tags are stored in a text column (i.e., " tag1 tag2 tag3 ").

Take for example this pseudo code:

let whitelist = ['tag1', 'tag3', 'tag5'];
let blacklist = ['tag2', 'tag4', 'tag6'];

connection.query('SELECT * FROM `posts` WHERE `tags` CONTAINS BLACKLIST TAG AND NOT WHITELIST TAG;', function (err, results) {
    // Returns all posts that have a blacklisted tag unless the post contains a tag on the whitelist
});

Is it possible to generate a query using the contents of whitelist and blacklist to select posts that contain a blacklisted tag unless they contain a whitelisted one?

Advertisement

Answer

The best thing to do here is to properly normalise your tags and not have a list of space-separated values. If you’re stuck with that format though, you can use regex to look for one of the values in each array inside the list of tags. You can make the regex from the whitelist and blacklist arrays using:

whitelist_regex = ' ' + whitelist.join(' | ') + ' ';
blacklist_regex = ' ' + blacklist.join(' | ') + ' ';

Then your query becomes

connection.query('SELECT * FROM `posts` WHERE `tags` RLIKE ? AND `tags` NOT RLIKE ?', 
                 [ blacklist_regex, whitelist_regex ],
                 function (err, results) {
                      // ...
                 });

Update

If the tags field looks like a series of space-separated values of the form tagtype:tag, you can change the two regexes to match that format:

whitelist_regex = ':' + whitelist.join(' |:') + ' ';
blacklist_regex = ':' + blacklist.join(' |:') + ' ';
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement