I want to get a record from a joint table at a time. But I don’t hope the tables are joined as a whole.
The actual tables are as follow.
table contents -- stores content information. +----+----------+----------+----------+-------------------+ | id | name |status |priority |last_registered_day| +----+----------+----------+----------+-------------------+ | 1 | content_1|0 |1 |2020/10/10 11:20:20| | 2 | content_2|2 |1 |2020/10/10 11:21:20| | 3 | content_3|2 |2 |2020/10/10 11:22:20| +----+----------+----------+----------+-------------------+ table clusters -- stores cluster information +----+----------+ | id | name | +----+----------+ | 1 | cluster_1| | 2 | cluster_2| +----+----------+ table content_cluster -- each record indicates that one content is on one cluster +----------+----------+-------------------+ |content_id|cluster_id| last_update_date| +----------+----------+-------------------+ | 1 | 1 |2020-10-01T11:30:00| | 2 | 2 |2020-10-01T11:30:00| | 3 | 1 |2020-10-01T10:30:00| | 3 | 2 |2020-10-01T10:30:00| +----------+----------+-------------------+
By specifying a cluster_id, I want to get one content name at a time where contents.status=2 and (contents name, cluster_id) pair is in content_cluster. The query in sql is something like follow.
SELECT contents.name FROM contents JOIN content_cluster ON contents.content_id = content_cluster.content_id where contents.status = 2 AND content_cluster.cluster_id = <cluster_id> ORDER BY contents.priority , contents.last_registered_day , contents.name LIMIT 1;
However, I don’t want the tables to be joined as a whole every time as I have to do it frequently and the tables are large. Is there any efficient way to do this? I can add some indices to the tables. What should I do?
Advertisement
Answer
I would try writing the query like this:
SELECT c.name FROM contents c WHERE EXISTS (SELECT 1 FROM content_cluster cc WHERE cc.content_id = c.content_id AND cc.cluster_id = <cluster_id> ) AND c.status = 2 ORDER BY c.priority, c.last_registered_day, c.name LIMIT 1;
Then create the following indexes:
content(status, priority, last_registered_day, name, content_id, name)
content_cluster(content_id, cluster_id)
.
The goal is for the execution plan to scan the index for context
and for each row, look up to see if there is a match in content_cluster
. The query stops at the first match.
I can’t guarantee that this will generate that plan (avoiding the sort), but it is worth a try.