I have 2 main table in MYSQL DB. I will call it entityA and entityB.
There are some logic that user will start to create a entityA and then convert it to entityB but some time they will not create entityA and only create entityB
Therefore, I have created a middle table which stored entityAId and entityBId.
entityAId | entityBId |
---|---|
1 | 1 |
2 | null |
null | 2 |
I need to show them in one list view for user. If there are entityB, show the information in entityB table and if no entityB, show entityA information. What current SQL will be like below:
SELECT IFNULL(entityB.colA, entityA.colA), IFNULL(entityB.colB, entity.colB), .... FROM middle LEFT JOIN entityA on middle.entityAId = entityA.id LEFT JOIN entityB on middle.entityBId = entityB.id WHERE (entityB.colA = "XXX" OR entityA.colA= "XXX") AND .... ORDER BY IFNULL(entityB.colA, entityA.colA) LIMIT 20 OFFSET 0
The SQL become slow when the table become bigger as one customer imported a large amount of historical data. Can anybody help how to make it faster? I am not sure if it can use by rewriting the SQL or creating some index on it? Thanks
======== AND SOME MORE FROM COMMENT =========
Currently middle table has no id, I just added a unique for entityAId and entityBId
Below is the Create statement for 3 table:
CREATE TABLE `entityA` ( `id` bigint NOT NULL AUTO_INCREMENT, `colA` varchar(100) DEFAULT NULL, `colB` varchar(100) DEFAULT NULL, PRIMARY KEY (`id`), KEY `idx_entityA_party_group_code` (`colA`), KEY `idx_entityA_booking_no` (`colB`), KEY `idx_entityA_id_partyGroupCode` (`id`,`colA`), KEY `idx_entityA_id_moduleTypeCode` (`id`,`colB`) ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `entityB` ( `id` bigint NOT NULL AUTO_INCREMENT, `colA` varchar(100) DEFAULT NULL, `colB` varchar(100) DEFAULT NULL, PRIMARY KEY (`id`), KEY `idx_entityB_party_group_code` (`colA`), KEY `idx_entityB_booking_no` (`colB`), KEY `idx_entityB_id_colA` (`id`,`colA`), KEY `idx_entityB_id_colB` (`id`,`colB`) ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `middle` ( `entityAId` bigint DEFAULT NULL, `entityBId` bigint DEFAULT NULL, UNIQUE KEY `idx_middle_entityAId_entityBId` (`entityAId`,`entityBId`), KEY `idx_middle_entityAId (`entityAId`), KEY `idx_middle_entityBId` (`entityBId`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
====. ADD explain in JSON =====
{ "query_block": { "select_id": 1, "cost_info": { "query_cost": "717863.40" }, "nested_loop": [ { "table": { "table_name": "middle", "access_type": "index", "key": "idx_middle_entityAId_entityBId", "used_key_parts": [ "entityAId", "entityBId" ], "key_length": "18", "rows_examined_per_scan": 896658, "rows_produced_per_join": 896658, "filtered": "100.00", "using_index": true, "cost_info": { "read_cost": "537.00", "eval_cost": "89665.80", "prefix_cost": "90202.80", "data_read_per_join": "20M" }, "used_columns": [ "entityAId", "entityBId" ] } }, { "table": { "table_name": "entityA", "access_type": "eq_ref", "possible_keys": [ "PRIMARY", "idx_entityA_id_entityANo", "idx_entityA_id_partyGroupCode", "idx_entityA_id_moduleTypeCode" ], "key": "PRIMARY", "used_key_parts": [ "id" ], "key_length": "8", "ref": [ "db_main.middle.entityAId" ], "rows_examined_per_scan": 1, "rows_produced_per_join": 896658, "filtered": "100.00", "cost_info": { "read_cost": "224164.50", "eval_cost": "89665.80", "prefix_cost": "404033.10", "data_read_per_join": "23G" }, "used_columns": [ "id", "partyGroupCode" ] } }, { "table": { "table_name": "entityB", "access_type": "eq_ref", "possible_keys": [ "PRIMARY", "idx_entityB_id_partyGroupCode", "idx_entityB_id_jobNo", "idx_entityB_id_masterNo", "idx_entityB_id_houseNo" ], "key": "PRIMARY", "used_key_parts": [ "id" ], "key_length": "8", "ref": [ "db_main.middle.entityBId" ], "rows_examined_per_scan": 1, "rows_produced_per_join": 896658, "filtered": "100.00", "cost_info": { "read_cost": "224164.50", "eval_cost": "89665.80", "prefix_cost": "717863.40", "data_read_per_join": "27G" }, "used_columns": [ "id", "partyGroupCode" ], "attached_condition": "<if>(found_match(entityB), (((`db_main`.`entityA`.`id` is not null) or (`db_main`.`entityB`.`id` is not null)) and ((`db_main`.`entityB`.`partyGroupCode` = 'DEV') or (`db_main`.`entityA`.`partyGroupCode` = 'DEV'))), true)" } } ] } }
Advertisement
Answer
When you have
PRIMARY KEY(id)
, do not also haveINDEX(id, ...)
. (Space improvement, not speed.)If
middle
is a many-to-many mapping table, I am surprised that either of the columns is declaredNULL
. Make themNOT NULL
and follow the indexing instructions in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table (Speed and space improvements; hard to predict how much improvement.)OR
does not optimize well. Use aUNION
: (Potentially a large speed improvement)( SELECT IFNULL(entityB.colA, entityA.colA), IFNULL(entityB.colB, entity.colB), .... FROM middle LEFT JOIN entityA ON middle.entityAId = entityA.id LEFT JOIN entityB ON middle.entityBId = entityB.id WHERE entityB.colA = "XXX" AND .... ORDER BY IFNULL(entityB.colA, entityA.colA) LIMIT 20 ) UNION ( SELECT IFNULL(entityB.colA, entityA.colA), IFNULL(entityB.colB, entity.colB), .... FROM middle LEFT JOIN entityA ON middle.entityAId = entityA.id LEFT JOIN entityB ON middle.entityBId = entityB.id WHERE entityA.colA= "XXX" AND .... ORDER BY IFNULL(entityB.colA, entityA.colA) LIMIT 20 ) ORDER BY IFNULL(entityB.colA, entityA.colA) LIMIT 20 OFFSET 0
If you will be increasing the
OFFSET
, see this for how to deal with it: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#orUNION ALL
is faster thanUNION DISTINCT
(which is the default). UseALL
if you “XXX” will not show up in both tables.LEFT
is probably unnecessary. (No speed change.)"rows_examined_per_scan": 896658
may drop to20
!It may be possible to store the data so that the relationship always has A < B. That way, the search could do one Select (without
OR
orUNION
) (double speed, but add complexity) Since you are hiding the “entity”, I cannot say whether it would matter.