Currently we have the following sample DB structure with a Many-to-Many relationship between lead
and tag
tables.
CREATE TABLE lead ( id serial constraint lead_id_pk primary key, name VARCHAR(20), surname VARCHAR(20) ); CREATE TABLE tag ( id serial constraint tag_id_pk primary key, name VARCHAR(20), description Text ); create table leads_tags ( id serial constraint lead_tags_id_pk primary key, lead_id integer not null constraint leads_tags_leads_id_fk references lead on delete cascade, tag_id integer not null constraint leads_tags_tags_id_fk references tag on delete cascade, constraint leads_tags_lead_id_tag_id_key unique (lead_id, tag_id) );
Let’s imagine that we would like to perform the following queries.
- Give me all the leads that have ANY of the tags in a list.
- Give me all the leads that have ALL of the tags in the list
For the first one (1) we can do:
Select distinct (id), * FROM lead where id in ( Select lt.lead_id from leads_tags lt where tag_id in (324, 129) )
For the second one (2) where we want to search for Leads that have both tags. Which is the most performant option according to your opinion? Is this the best approach?
SELECT * FROM lead WHERE id IN ( SELECT lt.lead_id FROM leads_tags lt GROUP BY lt.lead_id HAVING array_agg(tag_id) @> array [324,129] )
Advertisement
Answer
Provided unique (lead_id, tag_id)
I would say
SELECT * FROM lead WHERE id IN ( SELECT lt.lead_id FROM leads_tags lt GROUP BY lt.lead_id where tag_id in (324, 129) HAVING count(*) = 2 )