Given a simplified table structure like this:
CREATE TABLE t1 ( id INT, num INT, CONSTRAINT t1_pk PRIMARY KEY (id), CONSTRAINT t1_uk UNIQUE (id, num) )
Can I use a subquery like this for inserting records without causing a race condition?
INSERT INTO t1 ( id, num ) VALUES ( 1, ( SELECT MAX(num) + 1 FROM t1 ) )
Or are subqueries not atomic? I’m worried about simultaneous INSERT
s grabbing the same value for num
and then causing a unique constraint violation.
Advertisement
Answer
Yes, this can most certainly create a race condition, because while all statements are guaranteed atomic, this does not require them to have operated across an unchanging data set during the separate parts of the query’s execution.
A client submits your above query. So long as the engine finds the MAX(num)
while holding only locks that are compatible with other readers, then another client can find the same MAX(num)
before the INSERT
is performed.
There are four ways around this problem that I know of:
Use a sequence. In the
INSERT
you can just dosequencename.nextval
to return the next unique number to be inserted.SQL> create sequence t1num; Sequence created. SQL> select t1num.nextval from dual; NEXTVAL ---------- 1 SQL> select t1num.nextval from dual; NEXTVAL ---------- 2
Retry on failure. I read a credible article about a very high transactions-per-second system that had a scenario not exactly like this one but suffering from the same race condition of the
INSERT
possibly using the wrong value. They found that the highest TPS was achieved by first givingnum
a unique constraint, then proceeding as normal, and if theINSERT
was rejected due to a violation of the unique constraint, the client would simply retry.Add a locking hint that forces the engine to block other readers until the
INSERT
is completed. While this may be easy technically, it probably won’t be suitable for high concurrency. If theMAX()
is performed with a single seek, and the blocking is not long and does not block many clients, it could theoretically be acceptable, but most systems grow over time, quickly rendering this risky.Use a separate one-row helper table to record the next/most recent value for
num
. Perform a simultaneous read andUPDATE
on the helper table, then use the read value separately toINSERT
to the main table. In my mind, this has some annoyance of not being a single query, plus it does have the issue that if the client manages to “reserve” a value ofnum
, but then fails for any reason to actually perform theINSERT
, then a gap can occur in the values ofnum
in the table.