How do I select posts that contain a specific tag if there is a many-to-many relationship between posts and tags?
The problem I am having is that because of the where tag.name = 'xxx'
, only that tag is selected. I want to select all posts that have the tag specified, together with all of their tags, e.g.,
Post 1 -> tag1, tag2 Post 2 -> tag1, tag3 Post 3 -> tag2, tag3
Currently what I get is:
Post 1 -> tag2 // missing tag1 Post 3 -> tag2 // missing tag3
Advertisement
Answer
Assuming these tables:
- Posts: id, author, date, content
- Tags: id, name
- PostTags: post_id, tag_id
The last table is often called a join table and facilitates a many-to-many relationship between Posts and Tags.
SELECT p.* FROM posts p JOIN posttags pt ON p.id = pt.post_id JOIN tags t ON pt.tag_id = t.id WHERE t.name = 'sql'
Basically, think of a many-to-many relationship as two one-to-many relationships, because that’s how they’re implemented in normal RDBMSs. So the above query has a one-to-many join from Posts to PostTags and another from Tags to PostTags.
The PostTags table I created has a composite primary key, being (post_id, tag_id)
. That combination will be unique. Many disfavour composite keys so you’ll often see people creating a primary key column:
- PostTags: id, post_id, tag_id
Either method is fine. It’s largely a philosophical difference.
Update: if you want to select all the posts that have a particular tag and all the tags those posts have then:
SELECT p.* FROM posts p JOIN posttags pt ON p.id = pt.post_id JOIN tags t ON pt.tag_id = t.id WHERE p.id IN (SELECT post_id FROM PostTags pt JOIN tags t ON pt.tag_id = t.id WHERE t.name = 'xyz')
Another way to do this is:
SELECT p.* FROM posts p JOIN posttags pt ON p.id = pt.post_id JOIN tags t ON pt.tag_id = t.id WHERE EXISTS (SELECT post_id FROM PostTags pt JOIN tags t ON pt.tag_id = t.id WHERE t.name = 'xyz' AND pt.post_id = p.id)
Which performs better will need to be tested and may vary depending on database vendor and version. A good optimizer (ie Oracle) will probably optimize them to perform the same. Others may not.
Now this will get you rows back like this:
Post 1, tag 1 Post 1, tag 2 Post 3, tag 2 Post 3, tag 3
so you’ll need to combine them, preferably in application logic rather than SQL. Some RDBMSs have vendor-specific extensions for this kind of thing, like MySQL’s GROUP_CONCAT()
function.