Given a simplified table structure like this:
CREATE TABLE t1 ( id INT, num INT, CONSTRAINT t1_pk PRIMARY KEY (id), CONSTRAINT t1_uk UNIQUE (id, num) )
Can I use a subquery like this for inserting records without causing a race condition?
INSERT INTO t1 (
id,
num
) VALUES (
1,
(
SELECT MAX(num) + 1
FROM t1
)
)
Or are subqueries not atomic? I’m worried about simultaneous INSERTs grabbing the same value for num and then causing a unique constraint violation.
Advertisement
Answer
Yes, this can most certainly create a race condition, because while all statements are guaranteed atomic, this does not require them to have operated across an unchanging data set during the separate parts of the query’s execution.
A client submits your above query. So long as the engine finds the MAX(num) while holding only locks that are compatible with other readers, then another client can find the same MAX(num) before the INSERT is performed.
There are four ways around this problem that I know of:
Use a sequence. In the
INSERTyou can just dosequencename.nextvalto return the next unique number to be inserted.SQL> create sequence t1num; Sequence created. SQL> select t1num.nextval from dual; NEXTVAL ---------- 1 SQL> select t1num.nextval from dual; NEXTVAL ---------- 2Retry on failure. I read a credible article about a very high transactions-per-second system that had a scenario not exactly like this one but suffering from the same race condition of the
INSERTpossibly using the wrong value. They found that the highest TPS was achieved by first givingnuma unique constraint, then proceeding as normal, and if theINSERTwas rejected due to a violation of the unique constraint, the client would simply retry.Add a locking hint that forces the engine to block other readers until the
INSERTis completed. While this may be easy technically, it probably won’t be suitable for high concurrency. If theMAX()is performed with a single seek, and the blocking is not long and does not block many clients, it could theoretically be acceptable, but most systems grow over time, quickly rendering this risky.Use a separate one-row helper table to record the next/most recent value for
num. Perform a simultaneous read andUPDATEon the helper table, then use the read value separately toINSERTto the main table. In my mind, this has some annoyance of not being a single query, plus it does have the issue that if the client manages to “reserve” a value ofnum, but then fails for any reason to actually perform theINSERT, then a gap can occur in the values ofnumin the table.