Select ALL instead of ANY in a Many-to-Many relationship

Currently we have the following sample DB structure with a Many-to-Many relationship between lead and tag tables.

CREATE TABLE lead ( id serial
        constraint lead_id_pk
            primary key, name VARCHAR(20), surname VARCHAR(20) );

CREATE TABLE tag ( id serial
        constraint tag_id_pk
            primary key, name VARCHAR(20), description Text );

create table leads_tags
(
    id         serial
        constraint lead_tags_id_pk
            primary key,
    lead_id    integer not null
        constraint leads_tags_leads_id_fk
            references lead
            on delete cascade,
    tag_id     integer not null
        constraint leads_tags_tags_id_fk
            references tag
            on delete cascade,
    constraint leads_tags_lead_id_tag_id_key
        unique (lead_id, tag_id)
);

​x
 
CREATE TABLE lead ( id serial        constraint lead_id_pk            primary key, name VARCHAR(20), surname VARCHAR(20) );​CREATE TABLE tag ( id serial        constraint tag_id_pk            primary key, name VARCHAR(20), description Text );​create table leads_tags(    id         serial        constraint lead_tags_id_pk            primary key,    lead_id    integer not null        constraint leads_tags_leads_id_fk            references lead            on delete cascade,    tag_id     integer not null        constraint leads_tags_tags_id_fk            references tag            on delete cascade,    constraint leads_tags_lead_id_tag_id_key        unique (lead_id, tag_id));​

Let’s imagine that we would like to perform the following queries.

Give me all the leads that have ANY of the tags in a list.
Give me all the leads that have ALL of the tags in the list

For the first one (1) we can do:

Select distinct (id), *
FROM lead
where id in (
    Select lt.lead_id
    from leads_tags lt
    where tag_id in (324, 129)
    )

 
Select distinct (id), *FROM leadwhere id in (    Select lt.lead_id    from leads_tags lt    where tag_id in (324, 129)    )​

For the second one (2) where we want to search for Leads that have both tags. Which is the most performant option according to your opinion? Is this the best approach?

SELECT *
FROM lead
WHERE id IN (
    SELECT lt.lead_id
    FROM leads_tags lt
    GROUP BY lt.lead_id
    HAVING array_agg(tag_id) @> array [324,129]
)

 
SELECT *FROM leadWHERE id IN (    SELECT lt.lead_id    FROM leads_tags lt    GROUP BY lt.lead_id    HAVING array_agg(tag_id) @> array [324,129])​

Answer

Provided unique (lead_id, tag_id) I would say

SELECT *
FROM lead
WHERE id IN (
    SELECT lt.lead_id
    FROM leads_tags lt
    GROUP BY lt.lead_id
    where tag_id in (324, 129)
    HAVING count(*) = 2
)

 
SELECT *FROM leadWHERE id IN (    SELECT lt.lead_id    FROM leads_tags lt    GROUP BY lt.lead_id    where tag_id in (324, 129)    HAVING count(*) = 2)​

Advertisement

Answer