Skip to content
Advertisement

aggregating Posts with all their Hashtags using Postgres

I have something like Posts in FB/instagram. When a user searches for something, the input query is used to either find Posts that either have this query in their content, OR – one or more of post’s hashtags match this query.

CREATE TABLE posts (
    id int UNIQUE NOT NULL generated always as identity,
    content text,
    content_tokens tsvector,
    
    PRIMARY KEY (id)
);

CREATE TABLE tags (
    id int UNIQUE NOT NULL generated always as identity,
    tag_title text,
    
    PRIMARY KEY (id)
);

CREATE TABLE post_tags(
    post_id int,
    tag_id   int,
    
    PRIMARY KEY (post_id, tag_id),
    FOREIGN KEY (post_id) references posts (id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) references tags (id) ON DELETE CASCADE
);


-- query to find all "relevant" posts :

SELECT posts.id, posts.title, posts.description, tags.id as tag_id, tags.title as tag_title

FROM posts
JOIN post_tags
  ON posts.id = post_tags.post_id
JOIN tags 
  ON tags.id = post_tags.tag_id
  
WHERE tags.title ilike 'user_query'
   OR posts.content_tokens @@ plainto_tsquery('user_query:*');

The problem :

I can’t figure out how to have Posts returned with a nested array of all its pertinent Tags. So for example Post -> {post_id: 123 , tags : [{id: 1, title: a}, {id: 2, title : b}, ...]}

Also, it results in duplicate records since I can’t GROUP BY only by one column (post_id).

Examples:

posts tables has:

id    content 
------------------------
1     this is a test
2     this is a test 2
3     this is user_query

tags tables has:

id   title
---------------
1    user_query
2    something
3    hey

post_tags has:

post_id    tag_id 
------------------
1          1
1          2
2          2
3          2
3          3

So in this example, post_id = 1 has the “user_query” inside one of its tags, and post_id = 3 has the “user_query” inside its content. So, both of them should be returned.

[
  {"post_id" : 1, "tags" : [{"id" : 1, "title": "user_query"}, {"id" : 2, "title" : "something"}]}, 
  {"post_id" : 3, "tags" : [{"id" : 3, "title": "hey"}]}
]

Advertisement

Answer

Do you want JSON aggrgation?

SELECT p.id, p.title, p.description, 
    JSONB_AGG(JSONB_BUILD_OBJECT('id', t.id, 'title', t.title)) as tags
FROM posts p
INNER JOIN post_tags pt ON p.id = pt.post_id
INNER JOIN tags t ON t.id = pt.tag_id
GROUP BY p.id
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement