Skip to content
Advertisement

get one record at a time from joint table

I want to get a record from a joint table at a time. But I don’t hope the tables are joined as a whole.

The actual tables are as follow.

table contents -- stores content information.
+----+----------+----------+----------+-------------------+
| id |  name    |status    |priority  |last_registered_day|
+----+----------+----------+----------+-------------------+
|  1 | content_1|0         |1         |2020/10/10 11:20:20|
|  2 | content_2|2         |1         |2020/10/10 11:21:20|
|  3 | content_3|2         |2         |2020/10/10 11:22:20|
+----+----------+----------+----------+-------------------+

table clusters -- stores cluster information
+----+----------+
| id |  name    |
+----+----------+
|  1 | cluster_1|
|  2 | cluster_2|
+----+----------+

table content_cluster -- each record indicates that one content is on one cluster
+----------+----------+-------------------+
|content_id|cluster_id|   last_update_date|
+----------+----------+-------------------+
|        1 |        1 |2020-10-01T11:30:00|
|        2 |        2 |2020-10-01T11:30:00|
|        3 |        1 |2020-10-01T10:30:00|
|        3 |        2 |2020-10-01T10:30:00|
+----------+----------+-------------------+

By specifying a cluster_id, I want to get one content name at a time where contents.status=2 and (contents name, cluster_id) pair is in content_cluster. The query in sql is something like follow.

SELECT contents.name 
  FROM contents
  JOIN content_cluster
    ON contents.content_id = content_cluster.content_id
 where contents.status = 2 
   AND content_cluster.cluster_id = <cluster_id>
 ORDER 
    BY contents.priority
     , contents.last_registered_day
     , contents.name
 LIMIT 1;

However, I don’t want the tables to be joined as a whole every time as I have to do it frequently and the tables are large. Is there any efficient way to do this? I can add some indices to the tables. What should I do?

Advertisement

Answer

I would try writing the query like this:

SELECT c.name 
FROM contents c
WHERE EXISTS (SELECT 1
              FROM content_cluster cc
              WHERE cc.content_id = c.content_id AND
                    cc.cluster_id = <cluster_id>
             ) AND
      c.status = 2 
 ORDER BY c.priority, c.last_registered_day, c.name
 LIMIT 1;

Then create the following indexes:

  • content(status, priority, last_registered_day, name, content_id, name)
  • content_cluster(content_id, cluster_id).

The goal is for the execution plan to scan the index for context and for each row, look up to see if there is a match in content_cluster. The query stops at the first match.

I can’t guarantee that this will generate that plan (avoiding the sort), but it is worth a try.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement