Update a large table from a smaller, mission-critical table without locking the small table

In MySQL, I have two InnoDB tables: a small, mission-critical table that needs to be readily available at all times for reads and writes (call it mission_critical), and a larger table (tens of millions of rows) called big_table. I need to update big_table from mission_critical, for instance:

update mission_critical c, big_table b
set b.something = c.something_else
where b.refID = c.id;

This query could take more than an hour, and it creates a write lock on the mission_critical table. Is there a way I can tell MySQL, “I don’t want a lock on mission_critical”, so that the table can still be written to?

I understand that this is not ideal from a transactional point of view. The only workaround I can think of right now is to make a copy of the small mission_critical table and run the update from that copy (which I don’t mind getting locked), but I’d rather not do that if there is a way to make MySQL handle this more gracefully on its own.

It is not the table itself that is locked, but all of the records in mission_critical, since essentially all of them are scanned by the update. I am not assuming this; the symptom is that when a user logs in to the online system, it tries to update a datetime column in mission_critical to record the last time they logged in. These queries die with a “Lock wait timeout exceeded” error while the query above is running. If I kill the query above, all pending queries run immediately.
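For illustration, the blocked login updates are single-row writes of roughly this shape (the WHERE clause here is hypothetical, but lastLoginDate is the datetime column from the schema below):

update mission_critical
set lastLoginDate = NOW()
where id = 123;

While the big update holds its record locks, even this one-row write waits until innodb_lock_wait_timeout expires.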

mission_critical.id and big_table.refID are both indexed.

The pertinent portions of the creation statements for each table are:

mission_critical:

CREATE TABLE `mission_critical` (
  `intID` int(11) NOT NULL AUTO_INCREMENT,
  `id` smallint(6) DEFAULT NULL,
  `something_else` varchar(50) NOT NULL,
  `lastLoginDate` datetime DEFAULT NULL,
  PRIMARY KEY (`intID`),
  UNIQUE KEY `id` (`id`),
  UNIQUE KEY `something_else` (`something_else`)
) ENGINE=InnoDB AUTO_INCREMENT=1432 DEFAULT CHARSET=latin1

big_table:

CREATE TABLE `big_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `postDate` date DEFAULT NULL,
  `postTime` int(11) DEFAULT NULL,
  `refID` smallint(6) DEFAULT NULL,
  `something` varchar(50) NOT NULL,
  `change` decimal(20,2) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `refID` (`refID`),
  KEY `postDate` (`postDate`)
) ENGINE=InnoDB AUTO_INCREMENT=138139125 DEFAULT CHARSET=latin1

The EXPLAIN output for the query is:

+----+-------------+------------------+------------+------+---------------+-------+---------+------------------------------------+------+----------+-------------+
| id | select_type |      table       | partitions | type | possible_keys |  key  | key_len |                ref                 | rows | filtered |    Extra    |
+----+-------------+------------------+------------+------+---------------+-------+---------+------------------------------------+------+----------+-------------+
|  1 | SIMPLE      | mission_critical |            | ALL  | id            |       |         |                                    |  774 |      100 | Using where |
|  1 | UPDATE      | big_table        |            | ref  | refID         | refID |       3 | db.mission_critical.something_else | 7475 |      100 |             |
+----+-------------+------------------+------------+------+---------------+-------+---------+------------------------------------+------+----------+-------------+


Answer

I first suggested a workaround using a subquery, to create a copy in an internal temporary table, but in my test the small table was still locked for writes. So I guess your best bet is to make the copy manually.
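A minimal sketch of that manual copy (mission_critical_copy is a placeholder name; the multi-table UPDATE syntax matches the original query):

-- Throwaway copy; long-held locks on it don't block the live table.
CREATE TABLE mission_critical_copy LIKE mission_critical;
INSERT INTO mission_critical_copy SELECT * FROM mission_critical;

-- Run the long update against the copy instead of mission_critical.
UPDATE mission_critical_copy c, big_table b
SET b.something = c.something_else
WHERE b.refID = c.id;

DROP TABLE mission_critical_copy;

The INSERT … SELECT still read-locks mission_critical, but only for the moment it takes to copy a few hundred rows, not for the hour the big update runs.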

The reason for the lock is described in this bug report: https://bugs.mysql.com/bug.php?id=72005

This is what Sinisa Milivojevic wrote in an answer:

update table t1,t2 ....

any UPDATE with a join is considered a multiple-table update. In that case, a referenced table has to be read-locked, because rows must not be changed in the referenced table during UPDATE until it has finished. There can not be concurrent changes of the rows, nor DELETE of the rows, nor, much less, exempli gratia any DDL on the referenced table. The goal is simple, which is to have all tables with consistent contents when UPDATE finishes, particularly since multiple-table UPDATE can be executed with several passes.

In short, this behavior is for a good reason.

Consider writing INSERT and UPDATE triggers on mission_critical that update big_table on the fly. That would delay writes on the mission_critical table, but it might be fast enough for you, and you wouldn’t need the mass-update query any more.
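A sketch of the UPDATE trigger, assuming the goal is to keep big_table.something in sync with mission_critical.something_else (the trigger name is made up, and an analogous AFTER INSERT trigger would cover new rows):

DELIMITER $$

CREATE TRIGGER mission_critical_after_update
AFTER UPDATE ON mission_critical
FOR EACH ROW
BEGIN
  -- Only touch big_table when the synced column actually changes.
  IF NOT (NEW.something_else <=> OLD.something_else) THEN
    UPDATE big_table
    SET something = NEW.something_else
    WHERE refID = NEW.id;
  END IF;
END$$

DELIMITER ;

Each write to mission_critical then pays for the matching big_table rows up front (around 7,500 rows per id, going by the EXPLAIN above), which is the delay mentioned.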

Also check whether it would be better to use char(50) instead of varchar(50). I’m not sure, but it’s possible that this will improve the update performance, because the row size wouldn’t need to change. In a test I was able to improve the update performance by about 50%.
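If you want to try that, the conversion is a straightforward ALTER (note that char(50) pads stored values to full width, and altering a 138-million-row table is itself a long operation, so test on a copy first):

ALTER TABLE big_table MODIFY `something` char(50) NOT NULL;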
