SQL Version: MySQL 8.0 or SQL Server
I have a table of images and a table of tags that link to those images.
================================== =================================================== | tb_images | | tb_imagetags | ================================== =================================================== | f_imageID | f_imagefilename | | f_imagetagID | f_imagetagimage | f_imagetagname | ---------------------------------- --------------------------------------------------- | 1 | 1.jpg | | 10 | 1 | November | | 2 | 2.jpg | | 11 | 1 | 2021 | | 3 | 3.jpg | | 12 | 2 | November | ================================== | 13 | 2 | 2020 | | 14 | 3 | December | | 15 | 3 | 2020 | ===================================================
I want to be able to pass (2) tags to the query and have it select only the images that match BOTH tags. For example, I want to pass
2021 and have it return only
If I do something like this:
SELECT f_imageID, f_imagefilename FROM tb_images LEFT JOIN tb_imagetags ON f_imagetagimage = f_imageID WHERE f_imagetagname = 'November' OR f_imagetagname = '2021'
But that returns:
f_imageID f_imagefilename ================================ 1 1.jpg 1 1.jpg 2 2.jpg
How can I rewrite this query to only get images that match both tags?
The problem is your data is related across different rows. If all the data was in the same row then it’d be easy
SELECT * FROM blah WHERE month = nov and year = 2021
When it’s in different rows you want to get both rows like you’re doing..
..but then you only want those images for which there are two rows. If there is only one row (eg only Nov or only 2021) you don’t want that
There are various ways to do it. One is to join the tags table to itself, having filtered one of the sides to just months and the other to just years
tb_imagetags tmonth JOIN tb_imagetags tyear ON tmonth.f_imagetagname = 'November' AND tyear.f_imagetagname = '2021' AND tmonth.f_imagetagimage = tyear.f_imagetagimage
This would implicitly put month nov and year 2021 “on the same row” so only images with both those tags would appear in the join result..
..but probably the usual way we do such “across row” queries is to check the count after grouping them, or check that the min is x and the max is y, for example:
SELECT f_imageID, f_imagefilename FROM tb_images INNER JOIN tb_imagetags ON f_imagetagimage = f_imageID WHERE f_imagetagname = 'November' OR f_imagetagname = '2021' GROUP BY f_imageID HAVING COUNT(*) = 2
HAVING MIN(f_imagetagname) = '2021' AND MAX( f_imagetagname) = 'November'
The count thing works if the tag names are distinct. If you can double up November by accident then it’ll pick those up too. The min max only works for two tags.. you can also use something like
HAVING SUM(CASE f_imagetagname WHEN 'November' THEN 1 WHEN '2021' THEN 2 END) = 3
And that’s good for any number of criteria, you just go up in powers of two, so for 3 tags you case when 1,2,4 and demand the sum be 7, you could also go in powers of anything, like base 10.. go up in 1,10,100 and demand the sum be 111..
You can also ask that there exists a related row multiple times:
SELECT f_imageID, f_imagefilename FROM tb_images WHERE EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = 'November') AND EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = '2021')
EXISTS returns true if there is a row that meets the criteria: his sql means “images where there is some tag row that is November and there is some(other) tag row that is 2021”
Whatever you do, you need to think of a way to group the data up across the N rows where it exists and then do something that means the rows as a group meet the criteria. That’s a trick because we don’t usually think in those set terms as humans, we tend to think more “row by row”