Skip to content
Advertisement

SQL Query: Search Across Multiple Fields in JOINed Tables

SQL Version: MySQL 8.0 or SQL Server

SQL Fiddle: https://www.db-fiddle.com/f/wcHeXkcynUiYP3qzryoYJ7/6

I have a table of images and a table of tags that link to those images.

==================================  ===================================================
| tb_images                      |  | tb_imagetags                                    |
==================================  ===================================================
| f_imageID | f_imagefilename    |  | f_imagetagID | f_imagetagimage | f_imagetagname |
----------------------------------  ---------------------------------------------------
| 1         | 1.jpg              |  | 10           | 1               | November       |
| 2         | 2.jpg              |  | 11           | 1               | 2021           |
| 3         | 3.jpg              |  | 12           | 2               | November       |
==================================  | 13           | 2               | 2020           |
                                    | 14           | 3               | December       |
                                    | 15           | 3               | 2020           |
                                    ===================================================

I want to be able to pass (2) tags to the query and have it select only the images that match BOTH tags. For example, I want to pass November and 2021 and have it return only 1.jpg.

If I do something like this:

SELECT f_imageID, f_imagefilename 
FROM tb_images
LEFT JOIN tb_imagetags
  ON f_imagetagimage = f_imageID
  WHERE f_imagetagname = 'November'
    OR f_imagetagname = '2021'

But that returns:

f_imageID   f_imagefilename
================================
1           1.jpg
1           1.jpg
2           2.jpg

How can I rewrite this query to only get images that match both tags?

Advertisement

Answer

The problem is your data is related across different rows. If all the data was in the same row then it’d be easy

SELECT * FROM blah WHERE month = nov and year = 2021

When it’s in different rows you want to get both rows like you’re doing..

..but then you only want those images for which there are two rows. If there is only one row (eg only Nov or only 2021) you don’t want that

There are various ways to do it. One is to join the tags table to itself, having filtered one of the sides to just months and the other to just years

tb_imagetags tmonth 
JOIN tb_imagetags tyear 
ON 
  tmonth.f_imagetagname = 'November' AND
  tyear.f_imagetagname = '2021' AND
  tmonth.f_imagetagimage = tyear.f_imagetagimage

This would implicitly put month nov and year 2021 “on the same row” so only images with both those tags would appear in the join result..

..but probably the usual way we do such “across row” queries is to check the count after grouping them, or check that the min is x and the max is y, for example:

SELECT f_imageID, f_imagefilename 
FROM tb_images
INNER JOIN tb_imagetags
  ON f_imagetagimage = f_imageID
  WHERE f_imagetagname = 'November'
    OR f_imagetagname = '2021'
GROUP BY f_imageID
HAVING COUNT(*) = 2

Or

HAVING MIN(f_imagetagname) = '2021' AND MAX( f_imagetagname) = 'November'

The count thing works if the tag names are distinct. If you can double up November by accident then it’ll pick those up too. The min max only works for two tags.. you can also use something like

HAVING SUM(CASE f_imagetagname WHEN 'November' THEN 1 WHEN '2021' THEN 2 END) = 3

And that’s good for any number of criteria, you just go up in powers of two, so for 3 tags you case when 1,2,4 and demand the sum be 7, you could also go in powers of anything, like base 10.. go up in 1,10,100 and demand the sum be 111..

You can also ask that there exists a related row multiple times:

SELECT f_imageID, f_imagefilename 
FROM tb_images
WHERE 
  EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = 'November')
  AND
  EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = '2021')

EXISTS returns true if there is a row that meets the criteria: his sql means “images where there is some tag row that is November and there is some(other) tag row that is 2021”


Whatever you do, you need to think of a way to group the data up across the N rows where it exists and then do something that means the rows as a group meet the criteria. That’s a trick because we don’t usually think in those set terms as humans, we tend to think more “row by row”

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement