Skip to content
Advertisement

How to guarantee atomic SQL inserts with subqueries?

Given a simplified table structure like this:

 CREATE TABLE t1 (
        id INT,
        num INT,
        CONSTRAINT t1_pk
        PRIMARY KEY (id),
        CONSTRAINT t1_uk
        UNIQUE (id, num)
    )

Can I use a subquery like this for inserting records without causing a race condition?

INSERT INTO t1 (
    id,
    num
) VALUES (
    1,
    (
        SELECT MAX(num) + 1
        FROM   t1
    )
)

Or are subqueries not atomic? I’m worried about simultaneous INSERTs grabbing the same value for num and then causing a unique constraint violation.

Advertisement

Answer

Yes, this can most certainly create a race condition, because while all statements are guaranteed atomic, this does not require them to have operated across an unchanging data set during the separate parts of the query’s execution.

A client submits your above query. So long as the engine finds the MAX(num) while holding only locks that are compatible with other readers, then another client can find the same MAX(num) before the INSERT is performed.

There are four ways around this problem that I know of:

  1. Use a sequence. In the INSERT you can just do sequencename.nextval to return the next unique number to be inserted.

    SQL> create sequence t1num;
    
    Sequence created.
    
    SQL> select t1num.nextval from dual;
    
       NEXTVAL
    ----------
             1
    
    SQL> select t1num.nextval from dual;
    
       NEXTVAL
    ----------
             2
    
  2. Retry on failure. I read a credible article about a very high transactions-per-second system that had a scenario not exactly like this one but suffering from the same race condition of the INSERT possibly using the wrong value. They found that the highest TPS was achieved by first giving num a unique constraint, then proceeding as normal, and if the INSERT was rejected due to a violation of the unique constraint, the client would simply retry.

  3. Add a locking hint that forces the engine to block other readers until the INSERT is completed. While this may be easy technically, it probably won’t be suitable for high concurrency. If the MAX() is performed with a single seek, and the blocking is not long and does not block many clients, it could theoretically be acceptable, but most systems grow over time, quickly rendering this risky.

  4. Use a separate one-row helper table to record the next/most recent value for num. Perform a simultaneous read and UPDATE on the helper table, then use the read value separately to INSERT to the main table. In my mind, this has some annoyance of not being a single query, plus it does have the issue that if the client manages to “reserve” a value of num, but then fails for any reason to actually perform the INSERT, then a gap can occur in the values of num in the table.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement